Introduction to Differential Topology 
Matthew G. Brin 


Department of Mathematical Sciences 
State University of New York at Binghamton 
Binghamton, NY 13902-6000 


Spring, 1994 


CONTENTS


0. Introduction
1. Basics
2. Derivative and Chain rule in Euclidean spaces
3. Three derivatives
4. Higher derivatives
5. The full definition of differentiable manifold
6. The tangent space of a manifold
7. The Inverse Function Theorem
8. The $C^r$ category and diffeomorphisms
9. Vector fields and flows
10. Consequences of the Inverse Function Theorem
11. Submanifolds
12. Bump functions and partitions of unity
13. The $C^1$ metric
14. The tangent space over a coordinate patch
15. Approximations
16. Sard’s theorem
17. Transversality
18. Manifolds with boundary


0. Introduction. 


This is a quick set of notes on basic differential topology. It gets sketchier as it 
goes on. The last few sections are only to introduce the terminology and some of 
the concepts. These notes were written faster than I can read and may make no 
sense in spots. Were I to do them again, the first few topics would be rearranged 
into a different order. I am told that there are many misprints. 

The notes were designed to give a quick and dirty, half semester introduction 
to differential topology to students that had finished going through almost all of 
Topology: A first course by James R. Munkres. There are references to this book 
as “Munkres” in these notes. The notes were written so that all of the material 
could be presented by the students in class. This explains various exhortations to 
“presenters” that occur periodically throughout the notes. 

I cribbed from three main sources: 


(1) Serge Lang, Differential manifolds, Addison Wesley, 1972, 
(2) Morris W. Hirsch, Differential topology, Springer-Verlag, 1976, and 
(3) Michael Spivak, Calculus on manifolds, Benjamin, 1965. 


The last is a particularly pretty book that unfortunately seems to be out of print. 
I also stole from a few pages in 


(4) James R. Munkres, Elementary differential topology, Princeton, 1966 


whose title does not mean what it seems to mean. I do not identify the sources 
for the various pieces that show up in the notes. Other sources that might be 
interesting are 


(5) Th. Bröcker & K. Jänich, Introduction to differential topology, Cambridge,
1982,

(6) John W. Milnor, Topology from the differentiable viewpoint, Virginia, 1965, 
and 

(7) Andrew Wallace, Differential topology: first steps, Benjamin, 1968. 


Milnor’s book covers an amazing amount of ground in remarkably few pages.
Wallace’s takes an independent path and sets up some of the machinery needed for
discussion of surgery on manifolds.


1. Basics. 


Let U be an open subset of $R^m$. Let $f : U \to R^n$ be a map. Note that for
each $x \in U$ we have that $f(x)$ is an element of $R^n$ so that $f(x)$ is an
n-tuple or $f(x) = (f_1(x), \ldots, f_n(x))$. The functions $f_i(x)$ are the
coordinate functions of f. Note that each $x \in U$ is an m-tuple and can be
written $x = (x_1, \ldots, x_m)$.

We can now write down the partial derivatives of f if they exist. They are the
derivatives

\[ \frac{\partial f_i}{\partial x_j}, \qquad 1 \le i \le n, \quad 1 \le j \le m. \]

We say that f is differentiable of class $C^1$ (short for continuous first
derivatives) or just that f is $C^1$ if all of the first partial derivatives
exist and are continuous at all points of U. We say that f is smooth or
differentiable of class $C^\infty$ or just $C^\infty$ if all partial derivatives
of all orders exist and are continuous at all points of U. (We define $C^r$ by
requiring that partial derivatives up to order r exist and be continuous. We can
even define class $C^0$ by just requiring that the function f be continuous and
make no mention of derivatives.) Later, we will replace the definition of $C^1$
by another one that is not tied to the calculation of partial derivatives.

We can now try to apply these definitions to spaces that are modeled on Euclidean 
spaces — namely manifolds. 

Recall the definition of an n-manifold. We say that M is an n-manifold if M
is a separable, metric space so that every point $x \in M$ has a neighborhood U
in M with a homeomorphism $\theta_U : U \to R^n$. Note that the homeomorphism
$\theta_U$ gives each point $y \in U$ a set of coordinate values (by reading off
the coordinates of $\theta_U(y)$ in $R^n$). Thus the functions $\theta_U$ are
called coordinate functions. The open set U is called a coordinate patch. Note
that the coordinate patches form an open cover of M. (We will sometimes refer to
the pair $(U, \theta_U)$ as a coordinate chart.) An alternative wording for the
definition of an n-manifold is that it is a separable, metric space with an open
cover of sets homeomorphic to $R^n$. Note that the topology of M is determined
by the open cover in that a set $A \subset M$ is open in M if and only if
$A \cap U$ is open in U (i.e., $\theta_U(A \cap U)$ is open in $R^n$) for every
U in the open cover. We will use this later in a certain situation to determine
a topology from a cover of coordinate patches.

Coordinate functions can be used to transfer activities taking place in one or more 
manifolds to activities taking place in one or more Euclidean spaces. Consider the 
following. 

Let M be an m-manifold, let $x \in M$ and let N be an n-manifold. Let
$f : M \to N$ be a map taking x to $y \in N$. Let U be a coordinate patch about
x and V be a coordinate patch about y. Then $f^{-1}(V)$ is open in M and
intersects U in an open set. Thus there are open sets $W \subset R^m$ and
$W' \subset R^n$ so that $\theta_V \circ f \circ \theta_U^{-1}$ is defined from
W to W' after making suitable restrictions. Thus the function f between M and N
has been turned into a function between open subsets of Euclidean spaces.
Various phrases are attached to this process. The function
$\theta_V \circ f \circ \theta_U^{-1}$ is said to be an expression of f in local
coordinates or f expressed in local coordinates.

It is tempting to say that f is $C^1$ (or smooth or $C^r$) at x if
$\theta_V \circ f \circ \theta_U^{-1}$ is $C^1$ (or smooth or $C^r$) and that
the partial derivatives of f are just the partial derivatives of
$\theta_V \circ f \circ \theta_U^{-1}$. However there are problems with this
that we will go into. The problem of consistently determining when a function f
is differentiable requires a certain amount of work. The problem of determining
exactly what the derivative of f should be turns out to need even more work.

What are the problems? Consider the following homeomorphisms from R to
itself. Let

\[ \alpha(x) = x \qquad \text{and} \qquad
   \beta(x) = \begin{cases} x, & x \le 0, \\ 2x, & x \ge 0. \end{cases} \]

The space R is a 1-manifold because each $x \in R$ has a neighborhood (namely R
itself) that is homeomorphic to R. The functions $\alpha$ and $\beta$ are
possible choices for such a homeomorphism. Now let M and N be the 1-manifolds
whose underlying space is R, where R is the only coordinate patch for each of M
and N, and where M uses $\alpha$ as its coordinate function and N uses $\beta$
for its coordinate function. Consider the identity map f from R to itself. This
can be viewed as a map from M to M, from M to N, from N to M and from N to N.
Now we note that the maps $\alpha \circ f \circ \alpha^{-1}$ and
$\beta \circ f \circ \beta^{-1}$ are differentiable but
$\alpha \circ f \circ \beta^{-1}$ and $\beta \circ f \circ \alpha^{-1}$ are not.
Thus f is differentiable as a map from M to M and from N to N, but not from M to
N and not from N to M.
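The failure of differentiability can be seen numerically. The following is a
minimal sketch, assuming the coordinate functions are alpha(x) = x and beta(x)
equal to x for x <= 0 and 2x for x >= 0 (an assumed choice consistent with the
discussion above). The function names are illustrative only. It compares the
one-sided difference quotients at 0 of two expressions of the identity map in
local coordinates.

```python
def alpha(x):
    return x

def beta(x):
    # Assumed piecewise linear homeomorphism: slope 1 on the left, 2 on the right.
    return x if x <= 0 else 2 * x

def beta_inv(y):
    return y if y <= 0 else y / 2

def identity(x):
    return x

def one_sided_slopes(g, x0, h=1e-6):
    """Return the (left, right) difference quotients of g at x0."""
    left = (g(x0) - g(x0 - h)) / h
    right = (g(x0 + h) - g(x0)) / h
    return left, right

def a_f_binv(t):
    # alpha o f o beta^{-1}: source chart beta, target chart alpha
    return alpha(identity(beta_inv(t)))

def b_f_binv(t):
    # beta o f o beta^{-1}: the same chart on both sides
    return beta(identity(beta_inv(t)))

mixed = one_sided_slopes(a_f_binv, 0.0)
same = one_sided_slopes(b_f_binv, 0.0)
print("mixed charts:", mixed)  # left and right slopes disagree
print("same chart:  ", same)   # left and right slopes agree
```

With mixed charts the two one-sided slopes differ, so no derivative exists at 0.
With the same chart on both sides the expression is the identity and the slopes
agree.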

The problem arises now if we use both $\alpha$ and $\beta$ as choices for
coordinate functions for a single 1-manifold. (Such choices are almost never
avoidable since an n-manifold will usually have to be covered by overlapping
open sets with homeomorphisms to $R^n$. Consider a collection of open sets that
demonstrates that the circle is a 1-manifold.) Multiple choices of coordinate
functions mean that there are multiple ways to express a function in local
coordinates. For example, if both $\alpha$ and $\beta$ are available as
coordinate functions, then the answer to the question as to whether the identity
from R to itself is differentiable will depend on the coordinate functions used.
We need a way to ensure that a choice of coordinate functions does not make the
question of differentiability ambiguous.

We can now give a definition of a differentiable n-manifold. The definition of
an n-manifold is imitated but with a couple of changes. One is for convenience,
and the other is to make the notion of differentiability unambiguous. A
separable, metric space M is a differentiable n-manifold of class $C^r$ (or
just a $C^r$ n-manifold), $0 \le r \le \infty$, if there is an open cover O of M
so that each $U \in O$ has a homeomorphism $\theta_U : U \to U'$ where U' is an
open subset of $R^n$ and so that for each U and V in O with
$U \cap V \ne \emptyset$,

\[ (\theta_V|_{U \cap V}) \circ (\theta_U|_{U \cap V})^{-1} :
   \theta_U(U \cap V) \to \theta_V(U \cap V) \]

is $C^r$. The function
$(\theta_V|_{U \cap V}) \circ (\theta_U|_{U \cap V})^{-1}$ is known as an
overlap map. The definition requires that all overlap maps be $C^r$. We will add
one more condition later when it becomes convenient to have it and when the
reasons for it become more apparent. The new condition will not change the
definition and what we have so far will do.

If we regard R as a 1-manifold and use $\alpha$ above as its only coordinate
map, then R is a $C^\infty$ manifold. It is also a $C^\infty$ manifold if we use
$\beta$ as its only coordinate function. However, if we use both $\alpha$ and
$\beta$ as coordinate functions, then we only get a $C^0$ manifold.

We can now attack the idea of a differentiable function between $C^r$ manifolds.
Almost as before, let M be a $C^r$ m-manifold, let $x \in M$, let N be a $C^r$
n-manifold, let $f : M \to N$ be a map taking x to $y \in N$, let U be a
coordinate patch about x, and let V be a coordinate patch about y. We say that f
is differentiable of class $C^s$, $s \le r$, at x if
$\theta_V \circ f \circ \theta_U^{-1}$ (with suitable restrictions) is a $C^s$
map from an open set in $R^m$ containing $\theta_U(x)$ to an open set in $R^n$.
We say that f is differentiable of class $C^s$ if f is differentiable of class
$C^s$ at every $x \in M$.

We accept as a temporary black box: A composition of $C^r$ maps between open
sets in Euclidean spaces is $C^r$. We use this to verify: Whether the function f
of the previous paragraph is discovered to be $C^s$ at x is independent of the
coordinate patches and functions used. [Presenters: Check it out.] Thus a
function f is $C^s$ if every expression of f in local coordinates is $C^s$.

The actual derivative of a differentiable function is another matter. Consider
R as a 1-manifold with $\theta_1(x) = x$ and $\theta_2(x) = 2x$ as the available
coordinate functions. It is easily checked that the (only two) overlap maps are
$C^\infty$. Thus R with these coordinate functions is a $C^\infty$ 1-manifold.
Now consider the identity function f from R to itself. We might consider
$\theta_1 \circ f \circ \theta_1^{-1}$, or
$\theta_1 \circ f \circ \theta_2^{-1}$, or
$\theta_2 \circ f \circ \theta_1^{-1}$, or
$\theta_2 \circ f \circ \theta_2^{-1}$ to try to discuss the derivative of f at a
given point. However, the four expressions above give three possible candidates
(1, 1/2 and 2) for the value of f' at any given point.
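The three candidate values can be tabulated directly. The sketch below is only
an illustration: it takes the two coordinate functions x and 2x from the
paragraph above, forms the four expressions of the identity map in local
coordinates, and estimates each slope with a central difference. The dictionary
layout and helper names are assumptions made for the example.

```python
def theta1(x):
    return x

def theta1_inv(y):
    return y

def theta2(x):
    return 2 * x

def theta2_inv(y):
    return y / 2

def f(x):
    # The identity map of R, viewed as a map between manifolds.
    return x

def slope(g, t, h=1e-6):
    """Central-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

charts = {"theta1": (theta1, theta1_inv), "theta2": (theta2, theta2_inv)}

candidates = {}
for target_name, (target, _) in charts.items():
    for source_name, (_, source_inv) in charts.items():
        # Expression of f in local coordinates: target o f o source^{-1}
        expr = lambda t, target=target, source_inv=source_inv: target(f(source_inv(t)))
        candidates[(target_name, source_name)] = round(slope(expr, 1.0), 6)

for key, value in candidates.items():
    print(key, value)
print(sorted(set(candidates.values())))
```

Four expressions produce only three distinct slopes because the two expressions
that use the same chart on both sides are both the identity.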

An attempt can be made to get around this in the same way that we got around 
ambiguities in the notion of differentiability. We could try to restrict the overlap 
maps even further. The requirement could be that the overlap maps introduce no 
stretching. This can be done but it turns out to be incredibly restrictive. Some 
manifolds, such as $S^1$ and products of $S^1$ with itself, can be given such
structures, but infinitely many others can not. Another approach is used.

The calculation of derivatives for functions from $R^m$ to $R^n$ makes use of
the fact that Euclidean spaces are vector spaces and that a “calculus of
displacement” is available. Displacement is done with vectors. Vectors have the
properties of length and direction which can be exploited. In a manifold, the
notions of length and direction are handled by tools that can be adapted to the
manifold and that don’t depend on a notion of straightness. Specifically, we
will use curves, that is, differentiable functions from R to the manifold. If we
knew what the derivative of a curve was, then we would say that the derivative
at a point was giving us a direction and speed (the norm of the derivative) was
giving a length. It turns out that a workable system can be invented even if the
derivative of a curve is not known. All you need to know is when two curves
“deserve the same derivative” and how to form equivalence classes.

As preparation, we review derivatives of curves into $R^n$. Let $f : R \to R^n$
have coordinate functions $(f_1, \ldots, f_n)$. Then
$f' = (f_1', \ldots, f_n')$ and, for a given x,
$f'(x) = (f_1'(x), \ldots, f_n'(x))$ which is regarded as a vector that is
tangent to the curve f at f(x). For example, the straight line tangent to f at
f(x) can be formed as $T(t) = f(x) + t f'(x)$. The point of tangency is at
T(0) = f(x).

We are now ready for some definitions. Let M be a $C^r$ n-manifold, $r \ge 1$,
let $x \in M$ and let U be a coordinate patch containing x. Let C(x) be the set
of all $f : V \to U$ so that $V \subset R$ is open, $0 \in V$, f is $C^1$ and
f(0) = x. (Why is C(x) not empty?) We define a relation on C(x) by saying that
$f \sim g$ if $(\theta_U \circ f)'(0) = (\theta_U \circ g)'(0)$. [Presenters:
show that this does not depend on the coordinate patch U, and show that this is
an equivalence relation. This assumes a chain rule for maps between open subsets
of Euclidean space. Such a chain rule is written out in the next section.]

We define $T_x$ to be the set of equivalence classes and call it the tangent
space to M at x. Elements of $T_x$ are called tangent vectors at x. Of course,
the word “vector” is not yet justified.

We note that $\bar\theta_U : T_x \to R^n$ defined by
$[f] \mapsto (\theta_U \circ f)'(0)$ is well defined and one to one because of
the way the classes of $T_x$ are defined. We claim that it is also a surjection.
Let d be a vector in $R^n$. We can form the straight line $l : R \to R^n$ by
$l(t) = \theta_U(x) + td$. There is an open set V in R containing 0 so that
$f = \theta_U^{-1} \circ l$ is defined on V. Also, f(0) = x and f is $C^1$ since
$\theta_U \circ f = l$ is $C^1$. (In the last claim, we used the identity
coordinate function from R to itself in regarding R as a 1-manifold.) Now
$\bar\theta_U[f] = l'(0) = d$, so $\bar\theta_U$ is onto.

We now have a bijection $\bar\theta_U$ between $T_x$ and the vector space
$R^n$. We can use this to define a vector space structure on $T_x$ by saying
that $[f] + [g] = \bar\theta_U^{-1}(\bar\theta_U[f] + \bar\theta_U[g])$ and
$r[f] = \bar\theta_U^{-1}(r\bar\theta_U[f])$. Not only does this give us a
vector space structure on $T_x$ but it makes $\bar\theta_U$ an isomorphism. We
will make use of this isomorphism later, so it is worth summarizing in a lemma.


LEMMA 1.1. Let $\theta_U : U \to R^n$ be a coordinate function and $x \in U$.
Then $\bar\theta_U : T_x \to R^n$ defined by
$[f] \mapsto (\theta_U \circ f)'(0)$ is an isomorphism.


Let M be a $C^r$ m-manifold and let N be a $C^s$ n-manifold, r and s at least
1. We are now ready to talk derivatives. Let $f : M \to N$ be a $C^1$ map. Let x
be in M with y = f(x). We will define a function from $T_x$ to $T_y$. Let g be a
curve representing a tangent vector at x. Then we define
$Df_x([g]) = [f \circ g]$. [Presenters: this is well defined and is a linear
function from the vector space $T_x$ to the vector space $T_y$.]


PROPOSITION 1.2 (THE CHAIN RULE). Let M, N and P be differentiable manifolds of
class at least $C^1$. Let $f : M \to N$ and $h : N \to P$ be differentiable of
class at least $C^1$. Let $x \in M$ and let y = f(x). Then
$D(h \circ f)_x = (Dh_y) \circ (Df_x)$.


PROOF: [Presenters: ... ]


The chain rule is actually one step in a construction designed to make the deriva- 
tive a functor. It is not very interesting when applied only to the tangent space 
at one point, but it is a start. The other half of this start is the following trivial 
lemma. 


LEMMA 1.3. Let M be a $C^r$ m-manifold, $r \ge 1$, and let $i : M \to M$ be the
identity map. Then for any $x \in M$, $Di_x : T_x \to T_x$ is the identity.


COROLLARY 1.3.1. Let M and N be $C^r$ m-manifolds, $r \ge 1$, and let h be a
$C^1$ homeomorphism between them whose inverse is $C^1$. Then for any
$x \in M$, $Dh_x : T_x \to T_{h(x)}$ is an isomorphism.


The approach taken here is not the only approach to tangent vectors and tangent 
spaces. There are at least three approaches (and possibly more) that appear quite 
different, but which give structures with identical behavior. 

The next topic will fill in the black box mentioned above: compositions of $C^r$
maps between open sets in Euclidean spaces are $C^r$ maps. Even further, we will
derive a chain rule for maps between Euclidean spaces. This will then be used to
put a structure on the collection of all $T_x$, $x \in M$.


2. Derivative and Chain rule in Euclidean spaces.


If $f : R \to R$ is a function, then its derivative at x is defined by

\[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}. \]

If we try to generalize to functions $f : R^m \to R^n$, then we run into the
problem of dividing by a vector.

If we return to the case of $f : R \to R$, then the definition of derivative can
be reinterpreted to say that f is differentiable at x and that its derivative at
x has the value f'(x) if

\[ \lim_{h \to 0} \frac{f(x+h) - f(x) - f'(x)h}{h} = 0. \]

The function $h \mapsto f'(x)h$ is a linear function from R to R. If we call
this linear function $\lambda$, then we have that f is differentiable at x if
there is a linear function $\lambda : R \to R$ so that

\[ \lim_{h \to 0} \frac{f(x+h) - f(x) - \lambda(h)}{h} = 0. \]

The number f'(x) is just the slope of the linear function $\lambda$. Instead of
defining the derivative of f at x to be the slope of the linear function we can
define the derivative of f at x to be the linear function $\lambda$ itself. This
gives a setting that can be imitated in higher dimensions. Note that since the
definition involves a limit at a specific point, we only need to have f defined
on an open set containing the point. This will be reflected in the setting of
the definition.

Let $f : U \to R^n$ be a function where U is an open subset of $R^m$. We say
that f is differentiable at $x \in U$ if there is a linear function
$\lambda : R^m \to R^n$ so that

\[ \lim_{h \to 0} \frac{f(x+h) - f(x) - \lambda(h)}{\|h\|} = 0. \]

We could also say

\[ \lim_{h \to 0} \frac{\|f(x+h) - f(x) - \lambda(h)\|}{\|h\|} = 0 \]

since a vector goes to zero if and only if its length goes to zero. We say that
the derivative of f at x is $\lambda$ and denote it $Df_x$. The quotients make
sense since the denominators are real numbers. Note that the “domain” of the
limit is $U - x = \{u - x \mid u \in U\}$ which is the translation of the open
set U that carries x to 0 and is thus an open set in $R^m$ containing 0. In
$(\epsilon, \delta)$ form, the limit statement reads: for any $\epsilon > 0$,
there is a $\delta > 0$ so that for any $h \ne 0$ in the $\delta$-ball about 0
in $R^m$, we have that

\[ \frac{\|f(x+h) - f(x) - \lambda(h)\|}{\|h\|} < \epsilon. \]

Or, in other words,

\[ \|f(x+h) - f(x) - \lambda(h)\| < \epsilon \|h\|. \]
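The limit condition can be watched numerically. The following is a minimal
sketch under stated assumptions: the map f(x, y) = (x^2, xy) and its candidate
derivative are hypothetical examples chosen for illustration, and the candidate
linear map was computed by hand. The ratio of the error norm to the norm of h
should shrink as h shrinks.

```python
import math

def f(p):
    # Example map from R^2 to R^2 (chosen only for illustration).
    x, y = p
    return (x * x, x * y)

def lam(p, h):
    """Candidate linear map at p applied to h (hand-computed for this f)."""
    x, y = p
    return (2 * x * h[0], y * h[0] + x * h[1])

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def ratio(p, h):
    """||f(p+h) - f(p) - lam(h)|| / ||h||, the quotient in the definition."""
    fp = f(p)
    fph = f((p[0] + h[0], p[1] + h[1]))
    lh = lam(p, h)
    err = tuple(a - b - c for a, b, c in zip(fph, fp, lh))
    return norm(err) / norm(h)

p = (1.0, 2.0)
ratios = [ratio(p, (s, -s)) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)  # the quotients decrease toward 0 as h shrinks
```

For this particular f and this direction of h the error is exactly quadratic in
h, so the quotient is proportional to the size of h.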


PROPOSITION 2.1. Let $f : U \to R^n$ be differentiable at x where U is an open
set in $R^m$. Then $Df_x$ is unique.


PROOF: Suppose that linear $\lambda_i : R^m \to R^n$, i = 1, 2, both satisfy

\[ \lim_{h \to 0} \frac{\|f(x+h) - f(x) - \lambda_i(h)\|}{\|h\|} = 0. \]

Thus for $\epsilon > 0$ and restriction of h to a suitable $\delta$-ball we can
make

\[ \|f(x+h) - f(x) - \lambda_i(h)\| < \frac{\epsilon}{2} \|h\|. \]

Now,

\begin{align*}
\|\lambda_1(h) - \lambda_2(h)\|
  &= \|\lambda_1(h) - f(x+h) + f(x) + f(x+h) - f(x) - \lambda_2(h)\| \\
  &\le \|\lambda_1(h) - f(x+h) + f(x)\| + \|f(x+h) - f(x) - \lambda_2(h)\| \\
  &< \epsilon \|h\|.
\end{align*}

This gives the not surprising statement that the $\lambda_i$ do not differ by
much on small vectors. But the $\lambda_i$ are linear and we can use this and
the inequality above to show that they do not differ by much on any vector. Let
$v \in R^m$ be an arbitrary nonzero vector and let t > 0 be small enough so that
tv is in the $\delta$-ball. Then

\begin{align*}
t\epsilon\|v\| &= \epsilon\|tv\| \\
  &> \|\lambda_1(tv) - \lambda_2(tv)\| \\
  &= \|t\lambda_1(v) - t\lambda_2(v)\| \\
  &= t\|\lambda_1(v) - \lambda_2(v)\|.
\end{align*}

So

\[ \|\lambda_1(v) - \lambda_2(v)\| < \epsilon\|v\|. \]

But this can be done for this v and any $\epsilon > 0$. So
$\|\lambda_1(v) - \lambda_2(v)\| = 0$ and $\lambda_1 = \lambda_2$.

The next result, the chain rule, fills in the “black box” from the previous section. 
In its proof, we will need the continuity of certain linear functions. This is straight- 
forward but not trivial in the finite dimensional setting that we are in if we use the 
usual topology on the Euclidean spaces. It is false in infinite dimensions for most 
topologies that are put on the vector spaces. 

We will need the notion of the norm of a linear map. Let $\lambda : R^m \to R^n$
be a linear map. Let B be the closed unit ball in $R^m$ and let $\|\lambda\|$ be
the maximum distance from 0 to a point in $\lambda(B)$. This exists and is
finite since B is compact. It may be zero if $\lambda$ is the zero linear map.
Let $v \in R^m$ be nonzero. We have the following inequality:

\[ \|\lambda(v)\| = \|v\| \cdot \left\| \lambda\left( \frac{v}{\|v\|} \right) \right\|
   \le \|v\| \, \|\lambda\|. \]

(For v = 0 the inequality $\|\lambda(v)\| \le \|v\| \, \|\lambda\|$ is trivial.)
The finiteness of $\|\lambda\|$ depends on the continuity of $\lambda$. As
mentioned above, linear maps with finite dimensional domains are continuous. In
an infinite dimensional setting, the finiteness of $\|\lambda\|$ is equivalent
to the continuity of $\lambda$.


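THEOREM 2.2 (CHAIN RULE ON EUCLIDEAN SPACES). If $U \subset R^m$ and
$V \subset R^n$ are open sets and $f : U \to R^n$ and $g : V \to R^p$ are
differentiable at $a \in U$ and $b = f(a) \in V$ respectively, then
$g \circ f : U \to R^p$ is differentiable at a and

\[ D(g \circ f)_a = Dg_b \circ Df_a. \]


PROOF: Another way to interpret the definition of the derivative of f at x is to
say that if we define

\[ E(h) = f(x+h) - f(x) - Df_x(h), \]

then for any $\epsilon > 0$, there is a $\delta > 0$ so that $\|h\| < \delta$
implies $\|E(h)\| \le \epsilon\|h\|$. Note that E(0) = 0 so that we do not have
to say $0 < \|h\| < \delta$.

Let $\lambda = Df_x$ and $\mu = Dg_{f(x)}$, where x plays the role of the point
a in the statement. We have

\begin{align*}
\|g(f(x+h)) - g(f(x)) - \mu(\lambda(h))\|
  &= \|g(f(x) + \lambda(h) + E(h)) - g(f(x)) - \mu(\lambda(h) + E(h)) + \mu(E(h))\| \\
  &\le \|g(f(x) + \lambda(h) + E(h)) - g(f(x)) - \mu(\lambda(h) + E(h))\|
       + \|\mu(E(h))\|
\end{align*}

where the equality follows from the linearity of $\mu$. We will be done if for a
given $\epsilon > 0$ we can find a $\delta > 0$ so that $\|h\| < \delta$ makes

\[ \|g(f(x) + \lambda(h) + E(h)) - g(f(x)) - \mu(\lambda(h) + E(h))\|
   < \frac{\epsilon}{2}\|h\| \tag{1} \]

and

\[ \|\mu(E(h))\| < \frac{\epsilon}{2}\|h\|. \tag{2} \]

Since g is differentiable at f(x), for a given $\epsilon_1 > 0$ there is a
$\delta_1 > 0$ so that we have

\[ \|g(f(x) + \lambda(h) + E(h)) - g(f(x)) - \mu(\lambda(h) + E(h))\|
   \le \epsilon_1 \|\lambda(h) + E(h)\| \]

if

\[ \|\lambda(h) + E(h)\| < \delta_1. \tag{3} \]

Now, for a given $\epsilon_2 > 0$ there is a $\delta_2 > 0$ so that

\begin{align*}
\|\lambda(h) + E(h)\| &\le \|\lambda(h)\| + \|E(h)\| \\
  &\le \|\lambda\| \, \|h\| + \epsilon_2 \|h\| \tag{4} \\
  &= (\|\lambda\| + \epsilon_2) \|h\|
\end{align*}

for

\[ \|h\| < \delta_2 \tag{5} \]

so

\begin{align*}
\epsilon_1 \|\lambda(h) + E(h)\|
  &\le (\epsilon_1 \|\lambda\| + \epsilon_1 \epsilon_2) \|h\| \\
  &< \frac{\epsilon}{2} \|h\|
\end{align*}

if all of

\[ \epsilon_1 < \frac{\epsilon}{4}, \qquad
   \epsilon_1 < \frac{\epsilon}{4\|\lambda\|}, \qquad
   \epsilon_2 < 1 \tag{6} \]

hold. Thus we get (1) if we can satisfy all of (6). Now

\begin{align*}
\|\mu(E(h))\| &\le \|\mu\| \, \|E(h)\| \\
  &\le \epsilon_2 \|\mu\| \, \|h\| \\
  &< \frac{\epsilon}{2} \|h\|
\end{align*}

if

\[ \epsilon_2 < \frac{\epsilon}{2\|\mu\|}. \tag{7} \]

Thus we get (2) if we can satisfy (7).

So given $\epsilon$, we determine $\epsilon_1$ and $\epsilon_2$ from (6) and
(7). This determines $\delta_1$ and $\delta_2$ which puts our first restriction
$\delta < \delta_2$ on $\delta$ because of (5). We must deal with (3). But we
can get this from (4) by putting the restriction

\[ \delta < \frac{\delta_1}{\|\lambda\| + \epsilon_2} \]

on $\delta$. This finishes the proof.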
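A numerical sanity check of the chain rule is easy to set up. The sketch below
is only an illustration under stated assumptions: the maps f and g are
hypothetical polynomial examples, derivatives are approximated by central
differences rather than computed exactly, and the helper names `jacobian` and
`matmul` are invented for the example. It compares the matrix of the derivative
of the composition with the product of the two matrices.

```python
def jacobian(func, p, h=1e-6):
    """Central-difference approximation to the matrix of D(func) at p."""
    n_out = len(func(p))
    cols = []
    for j in range(len(p)):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(plus), func(minus)
        # Column j holds the partial derivatives with respect to x_j.
        cols.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # Transpose so that entry [i][j] approximates d f_i / d x_j.
    return [[cols[j][i] for j in range(len(p))] for i in range(n_out)]

def matmul(A, B):
    """Product of matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def f(p):  # example map from R^2 to R^3
    x, y = p
    return [x * y, x + y, x * x]

def g(q):  # example map from R^3 to R^2
    u, v, w = q
    return [u + v * w, u * u]

a = [1.0, 2.0]
b = f(a)
lhs = jacobian(lambda p: g(f(p)), a)            # matrix of D(g o f) at a
rhs = matmul(jacobian(g, b), jacobian(f, a))    # matrix of Dg_b composed with Df_a
max_diff = max(abs(l - r) for row_l, row_r in zip(lhs, rhs)
               for l, r in zip(row_l, row_r))
print(lhs)
print(rhs)
print(max_diff)
```

Composition of linear maps corresponds to the product of their matrices, which
is why the right hand side is a matrix product. Agreement is only up to the
accuracy of the finite-difference approximation.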


We give two easily computed derivatives. 


LEMMA 2.3. Let $f : R^m \to R^n$ be a linear mapping. Then for all $x \in R^m$,
$Df_x = f$.


PROOF: With f linear, f(x+h) = f(x) + f(h) so

\[ \lim_{h \to 0} \frac{f(x+h) - f(x) - f(h)}{\|h\|} = 0. \]

Since we need a linear function of h that gives the above limit and the linear f
does the trick, f must be the derivative.


LEMMA 2.4. If f is a constant, then all $Df_x$ are the zero transformation.


PROOF: The linear map 0 works in

\[ \lim_{h \to 0} \frac{f(x+h) - f(x) - 0(h)}{\|h\|} = 0. \]


We end with a lemma that we will use to relate two of the notions of derivative
that we have used so far. We assume the usual notation that if
$\alpha : A \to C$ and $\beta : B \to D$ are functions, then the notation
$\alpha \times \beta$ refers to the function from $A \times B$ to $C \times D$
defined by $(\alpha \times \beta)(a, b) = (\alpha(a), \beta(b))$. We also invent
a notation that if $\gamma : A \to B$ and $\delta : A \to C$ are given, then
$(\gamma, \delta)$ refers to the function from A to $B \times C$ defined by
$(\gamma, \delta)(a) = (\gamma(a), \delta(a))$.


LEMMA 2.5. If $U \subset R^m$ and $V \subset R^p$ are open sets and
$f : U \to R^n$ and $g : V \to R^q$ are differentiable at $a \in U$ and
$b \in V$ respectively, then $f \times g : U \times V \to R^n \times R^q$ is
differentiable at (a, b) and the derivative there is $Df_a \times Dg_b$. If, in
addition, $h : U \to R^k$ is differentiable at a, then (f, h) is differentiable
at a and the derivative there is $(Df_a, Dh_a)$.


PROOF: Consider

\begin{align*}
&\|(f \times g)(a + h_1, b + h_2) - (f \times g)(a, b) - (Df_a \times Dg_b)(h_1, h_2)\| \\
&\quad = \|(f(a + h_1), g(b + h_2)) - (f(a), g(b)) - (Df_a(h_1), Dg_b(h_2))\| \\
&\quad = \|(f(a + h_1) - f(a) - Df_a(h_1), \; g(b + h_2) - g(b) - Dg_b(h_2))\|. \tag{8}
\end{align*}

The i-th coordinate, i = 1, 2, in (8) can be kept less than
$\epsilon\|h_i\|$ by confining $h_i$ to some $\delta_i$-ball. So if

\[ \|(h_1, h_2)\| = \max\{\|h_1\|, \|h_2\|\} < \min\{\delta_1, \delta_2\}, \]

then both coordinates in (8) are less than

\[ \epsilon \max\{\|h_1\|, \|h_2\|\} = \epsilon \|(h_1, h_2)\|. \]

This proves the first part.

Now consider the diagonal map $d : U \to R^m \times R^m$ defined by
d(u) = (u, u). This is linear so Dd = d. Note that
$(f, h) = (f \times h) \circ d$. Now
$D(f, h) = D(f \times h) \circ Dd = (Df \times Dh) \circ d = (Df, Dh)$.

We can use this to relate the standard notion of the derivative of a curve to
the notion of a derivative as developed in this section. Recall that if f is a
function from R to R, then f'(x) gives the slope of $Df_x$. Thus for f and g
from R to R, we have f'(x) = g'(x) if and only if $Df_x = Dg_x$. Even more, we
can recover f'(x) from $Df_x$. Since f'(x) is the slope of the linear map
$Df_x : R \to R$, we must have $f'(x) = Df_x(1)$.

Now if we have $f : R \to R^n$, we have $f = (f_1, \ldots, f_n)$. By Lemma 2.5,
we have $Df = (Df_1, \ldots, Df_n)$. If $g : R \to R^n$ is given, then we also
have f'(x) = g'(x) if and only if $Df_x = Dg_x$. And further,

\begin{align*}
f'(x) &= (f_1'(x), \ldots, f_n'(x)) \\
      &= (D(f_1)_x(1), \ldots, D(f_n)_x(1)) \\
      &= Df_x(1).
\end{align*}

Going back to the setting of Section 1, we can now say that two curves f and g
represent the same tangent vector if
$D(\theta_U \circ f)_0 = D(\theta_U \circ g)_0$.

We leave as easy exercises the fact that the derivative is a linear operator on
functions. Specifically, $D(f + g)_x = Df_x + Dg_x$ and $D(rf)_x = rDf_x$.


3. Three derivatives.


We have been exposed to three kinds of derivatives. One is the usual Calculus
I-III derivative and has shown up in

\[ f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \]

for a function from R to R, and in

\[ (f_1, \ldots, f_n)' = (f_1', \ldots, f_n') \]

for a function from R to $R^n$. The second kind is the “advanced calculus”
derivative defined in the previous section as the best linear approximation to a
function from $R^m$ to $R^n$. The third kind was defined in the first section as
a linear function on a tangent space. We would like to combine these three
notions as much as possible, especially as we have used the same notation $Df_x$
for the last two of them. Because of this, we will agree for this section only
to use a boldface $\mathbf{D}$ for the “advanced calculus” derivative (best
linear approximation).

The notation f' has only been used in these notes to define classes of curves to
build tangent spaces and for the isomorphism of Lemma 1.1. In the previous
section, we showed that the use of f' can be eliminated from the definition of
classes in tangent spaces. That still leaves the use of f' in the isomorphism of
Lemma 1.1. We will try to eliminate as many references to f' as possible by
filtering all such references through an application of Lemma 1.1.

We now concentrate on $\mathbf{D}$ and D. We cannot eliminate $\mathbf{D}$ since
it is essential in defining the notion of differentiable for functions between
Euclidean spaces. However, what we can aim for is to show such a strong
equivalence between $\mathbf{D}$ and D that distinctions between them become
unimportant.

Here is the first lemma to try to blur some distinctions. 


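LEMMA 3.1. Let $U \subset R^m$ be an open set with $u \in U$. Let
$f : U \to R^n$ be $C^1$ and let v = f(u). Let $i : U \to R^m$ be inclusion and
let $j : R^n \to R^n$ be the identity. In the following diagram, $\bar{i}$ and
$\bar{j}$ are the isomorphisms of Lemma 1.1.

\[
\begin{array}{ccc}
T_u & \xrightarrow{\ \bar{i}\ } & R^m \\
\big\downarrow {\scriptstyle Df_u} & &
  \big\downarrow {\scriptstyle h = \bar{j} \circ Df_u \circ \bar{i}^{-1}} \\
T_v & \xrightarrow{\ \bar{j}\ } & R^n
\end{array}
\]

If h is defined as shown in the diagram, then $h = \mathbf{D}f_u$.


PROOF: We consider $(\bar{j} \circ Df_u \circ \bar{i}^{-1})(d)$ for some d in
$R^m$. We start with $\bar{i}^{-1}(d)$. For $l : R \to R^m$ defined by
$l(t) = i(u) + td = u + td$, we have
$\bar{i}^{-1}(d) = [i^{-1} \circ l] = [l]$. So

\begin{align*}
(\bar{j} \circ Df_u \circ \bar{i}^{-1})(d)
  &= \bar{j}(Df_u[l]) = \bar{j}[f \circ l] \\
  &= (j \circ f \circ l)'(0) = (f \circ l)'(0) \\
  &= \mathbf{D}(f \circ l)_0(1)
   = \mathbf{D}f_u(\mathbf{D}l_0(1))
   = \mathbf{D}f_u(d).
\end{align*}

The last line uses the formula $f'(x) = \mathbf{D}f_x(1)$ for curves, the chain
rule of Theorem 2.2, and $\mathbf{D}l_0(1) = l'(0) = d$.
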
This says that the two notions of derivative behave the same for functions
between Euclidean spaces. Now we bring in manifolds. In the statement we
simplify the notation for the coordinate function on a patch U by dropping the
subscript U and write $\theta$ instead of $\theta_U$. This is to keep the
notation from exploding.


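LEMMA 3.2. Let U be a coordinate patch in a $C^r$ m-manifold M with coordinate
function $\theta$ and let $u \in U$. Let $V = \theta(U)$ regarded as an
m-manifold with one coordinate patch V whose coordinate function is the
inclusion map $i : V \to R^m$. Then the following is a commutative diagram of
isomorphisms.

\[
\begin{array}{ccc}
T_u & \xrightarrow{\ \bar\theta\ } & R^m \\
 & {\scriptstyle D\theta_u} \searrow \qquad \nearrow {\scriptstyle \bar{i}} & \\
 & T_{\theta(u)} &
\end{array}
\]


PROOF: We know from Lemma 1.1 that $\bar\theta$ and $\bar{i}$ are isomorphisms.
If the diagram commutes, then $D\theta_u$ will be an isomorphism. To see that
the diagram commutes, let [f] be in $T_u$. We have
$\bar\theta[f] = (\theta \circ f)'(0)$. Now
$D\theta_u[f] = [\theta \circ f]$ and
$\bar{i}[\theta \circ f] = (i \circ \theta \circ f)'(0) = (\theta \circ f)'(0)$.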

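The next lemma looks at maps between manifolds. Again we leave subscripts off
the coordinate functions.


LEMMA 3.3. Let M be an m-manifold and N be an n-manifold, each of class at
least $C^1$. Let $f : M \to N$ be a $C^1$ map and let $u \in M$ with v = f(u).
Let U be a coordinate patch around u with coordinate function $\theta$ and let V
be a coordinate patch around v with coordinate function $\phi$. To avoid
restrictions, assume that $f(U) \subset V$ and use this to define
$h = \phi \circ f \circ \theta^{-1}$. Let i and j be the inclusions of
$\theta(U)$ and $\phi(V)$ respectively into $R^m$ and $R^n$. Then the following
diagram commutes and the non-vertical arrows are isomorphisms.

\[
\begin{array}{ccccc}
T_u & & \xrightarrow{\quad \bar\theta \quad} & & R^m \\
 & \searrow {\scriptstyle D\theta_u} & & {\scriptstyle \bar{i}} \nearrow & \\
 & & T_{\theta(u)} & & \\
\big\downarrow {\scriptstyle Df_u} & &
  \big\downarrow {\scriptstyle Dh_{\theta(u)}} & &
  \big\downarrow {\scriptstyle \mathbf{D}h_{\theta(u)}} \\
 & & T_{\phi(v)} & & \\
 & \nearrow {\scriptstyle D\phi_v} & & {\scriptstyle \bar{j}} \searrow & \\
T_v & & \xrightarrow{\quad \bar\phi \quad} & & R^n
\end{array}
\]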


PROOF: The isomorphisms and the commutativity of all but the left hand trapezoid 
follow from the previous two lemmas. The commutativity of the left hand trapezoid 
follows from the chain rule. 


There are three main quadrilaterals in the diagram of Lemma 3.3 — the outer 
square and the two trapezoids. Each can be interpreted in words. The outer square 
says that when h is an expression of f in local coordinates, then the isomorphisms 
induced by the coordinate functions used in the expression conjugate the action of 
Df on the tangent spaces to the action of Dh as a linear map between Euclidean 
spaces. The two trapezoids say almost identical things in slightly different settings. 

At this point the notation $\mathbf{D}$ ends. Even though there are two
different notions of derivative that will have the same notation, the ambiguity
will not be important.


4. Higher derivatives. 


We give one more section that concentrates on maps between Euclidean spaces. 

I’m trying as hard as I can to avoid partial derivatives. Before partial derivatives 
make an appearance, we have that if $f : R^m \to R^n$ is differentiable at x,
then the derivative $Df_x$ at x is a linear map from $R^m$ to $R^n$. Further if
f is differentiable at all points in $R^m$, then we have a function Df from
$R^m$ to the set of linear transformations from $R^m$ to $R^n$. We can call this
function the derivative of f. If we stop here, then partial derivatives have not
been brought in. They are brought in if we try to make the set of linear
transformations from $R^m$ to $R^n$ look more familiar.

In order to make the set of linear transformations from $R^m$ to $R^n$ look
more familiar, we need to choose a preferred basis for both $R^m$ and $R^n$. If
we choose the standard bases (unit vectors in the coordinate directions), then a
linear transformation from $R^m$ to $R^n$ is represented by an $n \times m$
matrix. At this point the partial derivatives have appeared. This is because the
particular matrix that represents $Df_x$ using the standard bases is the matrix
whose entries are

\[ (Df_x)_{ij} = \frac{\partial f_i}{\partial x_j} \]

if we regard the matrix as acting on the left and we regard elements of $R^m$
and $R^n$ as column vectors. We drop the partial derivatives for several
paragraphs to inspect the structure that we have built so far.

We have that Df is a function from $R^m$ to the set of linear transformations
from $R^m$ to $R^n$. With our choice of bases, we have a particular one to one
correspondence between the set of linear transformations from $R^m$ to $R^n$ and
the set of $n \times m$ matrices. Thus our choice of basis allows us to look at
Df as a function from $R^m$ to the set of $n \times m$ matrices.

We can add extra structure to the set of $n \times m$ matrices and make a
topological space and a vector space out of it. This can be done by letting
basis vectors for the set of $n \times m$ matrices be those $n \times m$
matrices with a one in a single position and zeros everywhere else. This
(second) choice now makes Df a function from $R^m$ to $R^{nm}$.

Now that Df is a function between Euclidean spaces, we can discuss two things:
the continuity of Df and the differentiability of Df. If Df is continuous, then
f is of class $C^1$. If Df is differentiable, then its derivative $D^2 f$ is a
function from $R^m$ to $R^{nm^2}$. We see that we can now discuss higher
derivatives and higher classes of differentiability. In particular, we can point
out that f is of class $C^r$ if and only if Df is of class $C^{r-1}$.

Note that linear functions are infinitely differentiable. In fact, if f is linear, then
Df_x = f for all x so that Df is a constant function (even though each Df_x is not a
constant linear transformation). Since Df is constant, all higher derivatives of f
are zero.

The fact that linear functions are infinitely differentiable is relevant because
choices were made in setting up Df as a function from R^m to R^{nm}. The
correspondence depended on two choices of bases. Different choices of bases give
different correspondences that can be obtained from the original by multiplying by
“change of basis” matrices at appropriate places. Multiplying by matrices is linear
and thus infinitely differentiable. From this it follows that if f is C^r as measured
with one choice of bases, then it is C^r as measured with another.

We now return to the partial derivatives. Our choice of bases made Df a
function from R^m to R^{nm}. The coordinates in R^{nm} are the entries in the matrices
that represent the linear transformations Df_x. These entries are just the partial
derivatives of f at x. Thus the coordinate functions of Df are the partial
derivatives. This means that a C^1 function f has continuous partial derivatives and a
C^r function f has partial derivatives of class C^{r-1}.

There are converses to this (continuous partial derivatives imply continuously 
differentiable) but we will not go into this. This might leave a hole a couple of
sections down the way. There are proofs of this converse in various books on
advanced calculus. 


5. The full definition of differentiable manifold. 


It is now as good a time as any to finish the definition of a differentiable manifold. 
In discussions that will come up sooner or later, it will be convenient to introduce 
more flexibility into our choice of coordinate charts. The addition to the definition 
will give us this flexibility. We have already seen the need for the flexibility in the 
statement of Lemma 3.3 where we assumed that one coordinate patch mapped into 
another in order to avoid having to mess up the notation with restrictions. 

Our current definition of a C^r m-manifold is that it is a separable, metric space
with an open cover of coordinate patches that have C^r overlap maps. We now
shift our focus from coordinate patches (the domains of the coordinate functions)
to coordinate charts (the domains of the coordinate functions together with the
coordinate functions). (Our distinction between coordinate patches and coordinate
charts is not exactly standard.) We now define a C^r m-manifold M to be a separable,
metric space with a collection of coordinate charts {(U,θ)} where θ is a
homeomorphism from U to an open subset of R^m. We drop the subscript from θ since
we no longer regard θ as determined by U. In fact, there may be many coordinate
functions with the same domain. We put three conditions on the collection of
coordinate charts. The first two are already familiar. 1: The domains of the coordinate
functions shall form an open cover of M. 2: The overlap maps shall be C^r. 3:
The collection of coordinate charts shall be maximal with respect to conditions 1
and 2. The collection of coordinate charts is called the differential structure for the
manifold.

Condition 3 seems as though it might introduce some ambiguity as to what the
collection of charts should be. This is not the case. Let A be a collection of
coordinate charts on M that satisfies 1 and 2 but not 3. Let B be a collection of
coordinate charts on M that satisfies nothing in particular. It turns out that in order
to tell if A ∪ B is a collection that satisfies 1 and 2, it is only necessary to check,
for each chart (U,θ) in B, that all overlap maps involving (U,θ) and a chart in A
are C^r. [Presenters: ... .] Thus the “admissibility” of B as a possible addition to
A depends only on the individual charts in B and not on any properties of B as a
collection. Thus a maximal collection based on A is obtained by throwing in any
chart whose overlap maps with the charts of A are C^r.

This has several consequences. The first consequence discusses how little
information is needed to determine the structure on a manifold. Let C be a collection
of coordinate charts satisfying 1 and 2. Let A and B be subcollections of C that
also satisfy 1 and 2. All the charts in C are compatible with A and also with B.
Thus if we start with only A and maximize to obtain 3, we will add all the charts
originally in C. Similarly, if we start with only B and maximize to obtain 3, we
will add all the charts originally in C. Thus, the differential structure on a manifold
is determined by the class of differentiability desired and by any subcollection of
charts of the differential structure whose domains cover the manifold.

The second consequence discusses the richness of charts available. Let M be a
C^r m-manifold, let x be a point in an open set E of M and let (U,θ) be a
coordinate chart with x ∈ U. But now (U ∩ E, θ|_{U∩E}) is a valid coordinate chart.
If it were not in the collection of charts, then its overlap maps with all existing
charts would just be restrictions of existing overlap maps and would be C^r. By
maximality, it must be in the collection of charts. This is the last time we will
repeat this argument.

Now, instead of working with θ|_{U∩E}, we will just assume that θ|_{U∩E} has replaced
θ and that U ⊂ E. We will do further replacements introduced by the code words
“we now assume” to improve things even more. Now θ(x) ∈ θ(U) and θ(U) is an
open set in R^m. There is an open ε-box

    D = \{(x_1, \dots, x_m) \mid a_i < x_i < b_i,\ b_i - a_i = \epsilon,\ 1 \le i \le m\}

in θ(U) with θ(x) = ((a_1 + b_1)/2, ..., (a_m + b_m)/2) at its center. By restricting
θ to θ^{-1}(D), we now assume that θ(U) = D. There is a C^∞ homeomorphism
taking D to R^m. This can be done in several steps. First take D to the open
ε-box centered at the origin by translating θ(x) to the origin. Then dilate by π/ε
to get to (-π/2, π/2)^m. Now take (-π/2, π/2)^m to R^m by taking (x_1, ..., x_m) to
(tan(x_1), ..., tan(x_m)). The tangent function is C^∞ on (-π/2, π/2) and has a C^∞
inverse. Thus we can now assume that the coordinate function takes U to all of R^m.
What we have shown is that every point has arbitrarily small neighborhoods that are
domains of charts whose image is all of R^m.
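
The three steps (translate, dilate, apply the tangent) can be sketched in
Python. This is only an illustration under stated assumptions: the function
names box_to_Rm and Rm_to_box and the particular box are made up for the
example, and the box is described by lists a and b of its lower and upper
corners with every side of the same length.

```python
import math

def box_to_Rm(p, a, b):
    """Send a point p of the open box with corners a, b onto R^m."""
    eps = b[0] - a[0]                      # common side length of the box
    out = []
    for pi, ai, bi in zip(p, a, b):
        centered = pi - (ai + bi) / 2      # translate the center to the origin
        scaled = centered * math.pi / eps  # dilate by pi/eps into (-pi/2, pi/2)
        out.append(math.tan(scaled))       # stretch the interval onto the line
    return out

def Rm_to_box(q, a, b):
    """Inverse map: arctangent, undo the dilation, undo the translation."""
    eps = b[0] - a[0]
    return [math.atan(qi) * eps / math.pi + (ai + bi) / 2
            for qi, ai, bi in zip(q, a, b)]

a, b = [0.0, 1.0], [0.5, 1.5]              # hypothetical box with eps = 0.5
center_image = box_to_Rm([0.25, 1.25], a, b)
p = [0.1, 1.4]
round_trip = Rm_to_box(box_to_Rm(p, a, b), a, b)
```

The center of the box goes to the origin, and composing the map with its
inverse returns the starting point up to rounding.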

We can combine our two consequences and say that every differentiable structure
has charts whose images are all of R^m and whose domains contain a neighborhood
base for every point in the manifold.


6. The tangent space of a manifold. 


Let M be a C^r m-manifold and let TM be the union of all the T_x for x ∈ M.
We want to define a structure on TM. This means two things. We want to define
a topology on TM. But the current subject is differentiable manifolds. So we also
want to define a set of differentiable coordinate patches that cover TM. When we
have done so, we will have defined the tangent space of the manifold M.

It is possible to spend an infinite amount of time on the tangent space. I want 
to avoid that. We will see to what extent I succeed. 

Since each T_x in TM is a vector space isomorphic to R^m, it is tempting to
associate TM with M × R^m. However, this turns out not to be the right structure
in general. For a subset U ⊂ M, we can define TU to be the union of all the T_x
for x ∈ U. When U is a coordinate patch, then U × R^m does turn out to be the
right structure for TU. From this, the right structure for TM will follow.

There are two possible approaches toward proving that the structure for TU is
U × R^m when U is a coordinate patch of M. One is to come up with a mathematical
reason as to why this is so. The other is to simply make this a definition. The second
approach is not at all unreasonable since we will show that the coordinate function
induces a natural one to one correspondence between TU and U × R^m. This is
reminiscent of our definition of the vector space structure on T_x.

The second approach above (the “just make it a definition” approach) has many 
advantages. The first is that it gives reasonable answers and that it is easier than 
the first approach. Another advantage is that many structures get defined on 
differentiable manifolds and they are usually defined patch by patch. The definition 
usually starts by declaring that the structure restricted to any single coordinate 
patch is a product. Often this is justified by the fact that the coordinate function 
induces a natural one to one correspondence between the structure over the patch 
and the appropriate product. It might be considered a precedent that if it is proven 
laboriously that the tangent space over a coordinate patch should be a product, 
then it should be proven that all other structures are products over coordinate 
patches. We will take the point of view that once it is shown that tangent spaces 
should be products over coordinate patches, then it will be reasonable to accept as 
given that other structures defined in the future should be products over coordinate 
patches. 

We will divide our discussion of the tangent space into two parts. In this sec- 
tion we will assume that the tangent space over a coordinate patch is a product. 
(Actually, we will make it look rather reasonable because of the one to one corre- 
spondence.) In later sections we will justify this. 

Now let M be a C^r m-manifold, and let (U,θ) be a coordinate chart for M.
We define

    TM = \bigcup_{x \in M} T_x

and

    TU = \bigcup_{x \in U} T_x.

Note that these are disjoint unions since each T_x consists of classes of curves that
are required (among other things) to carry 0 to x. Thus T_x and T_y have nothing
in common unless x = y.

We have a function π : TM → M which takes each vector v in TM to the
unique x ∈ M for which v ∈ T_x. Note that this can be thought of as evaluation at
0. Again, this is because T_x consists of classes of curves into M which carry 0 to x.

We now consider the coordinate chart (U,θ). Let U' = θ(U) ⊂ R^m.

Recall the isomorphism θ_* : T_u → R^m for each u ∈ U defined by θ_*[f] = (θ∘f)'(0).
This is imperfect notation since it is a different isomorphism for each u ∈ U. We
recycle this notation to give a function θ_* : TU → R^m defined by exactly the same
formula θ_*[f] = (θ∘f)'(0). It is an isomorphism when restricted to a single T_u, u ∈
U. We also invent a function θ_π : TU → R^m defined by θ_π[f] = θ(π[f]) = (θ∘f)(0).
The last is well defined since all f in a class are required to take 0 to the same
point.

Define a function Θ : TU → U' × R^m by

    \Theta(v) = (\theta_\pi(v), \theta_*(v)).

The function Θ is a one to one correspondence. To show one to one, we note that
if v and w come from different T_x and T_y, then θ_π(v) ≠ θ_π(w) since θ is one to
one. If v and w come from one T_x but v ≠ w, then θ_*(v) ≠ θ_*(w) because θ_* is
an isomorphism when restricted to T_x. The function is onto because θ : U → U' is
onto and each T_x, x ∈ U, is carried onto {θ(x)} × R^m by Θ.

We now declare the one to one correspondence Θ between TU and U' × R^m to
be a homeomorphism by setting the open sets in TU to be the images under Θ^{-1}
of the open sets in U' × R^m. Since U' × R^m is an open subset of R^{2m}, we have
ourselves a coordinate chart for TM. Since the domains of the coordinate charts
of M cover M, the coordinate charts that we have just defined cover TM. As
mentioned in Section 1, this determines the topology on TM. We must check that
the overlap maps are well behaved.

Note that TU ∩ TV ≠ ∅ if and only if U ∩ V ≠ ∅. In fact, TU ∩ TV = T(U ∩ V).
Assume that (U,θ) and (V,φ) are coordinate charts with U ∩ V ≠ ∅. Consider the
homeomorphisms

    \Theta = (\theta_\pi, \theta_*) : TU \to \theta(U) \times R^m,
    \Phi = (\phi_\pi, \phi_*) : TV \to \phi(V) \times R^m

and the restrictions to which we give the same names

    \Theta : T(U \cap V) \to \theta(U \cap V) \times R^m,
    \Phi : T(U \cap V) \to \phi(U \cap V) \times R^m.

We now must consider

    (\Phi \circ \Theta^{-1}) : \theta(U \cap V) \times R^m \to \phi(U \cap V) \times R^m

as an overlap map. We first identify what is going on in each coordinate.

On the first coordinate, we are looking at a map that takes θ_π(v) to φ_π(v). But
θ_π(v) is just θ(π(v)) or θ(x) where v ∈ T_x. This is carried to

    \phi(x) = (\phi \circ \theta^{-1})(\theta(x)).

Thus the action on the first coordinate is just that of (φ∘θ^{-1}) or the overlap map
between the charts (U,θ) and (V,φ).

On the second coordinate, there is no subtlety. The map takes θ_*(v) to

    \phi_*(v) = (\phi_* \circ \theta_*^{-1})(\theta_*(v))

and the action on the second coordinate is that of (φ_*∘θ_*^{-1}).

The action on the second coordinate can be reinterpreted with the aid of Lemma
3.3. In the setting of that lemma, let the map f be the identity. With this
assumption, the lemma is discussing the identity map expressed in local coordinates
under two different coordinate functions. This expression in local coordinates is
just the overlap map. The conclusion of the lemma (the outer square) is that
the derivative of the overlap map is the composition (φ_*∘θ_*^{-1}). Of course this
notation suppresses the fact that these derivatives are taken at specific points.
More accurately, the map from {θ(x)} × R^m to {φ(x)} × R^m is the derivative of
the overlap map (φ∘θ^{-1}) at θ(x).

We now prepare ourselves to forget that we are looking at maps developed from
an overlap map of M and use h to denote (φ∘θ^{-1}). Let U' = θ(U ∩ V) and let
V' = φ(U ∩ V). Our analysis above says that we are looking at a map

    \hat{h} : U' \times R^m \to V' \times R^m

that takes (u,v) to (h(u), Dh_u(v)). We will analyze the differentiability of this
map by representing it as a composition of several maps.

Our discussion in Section 4 gives us a map A : U' → R^{m^2} that takes u to the
matrix representation of Dh_u. By definition of class, this map is of class C^{r-1} if
h is of class C^r. If i represents the identity on U', then we get the map

    (i, A) : U' \to U' \times R^{m^2}

which is of class C^{r-1} by Lemma 2.5. If j represents the identity on R^m, then we
have the map

    ((i, A) \times j) : U' \times R^m \to U' \times R^{m^2} \times R^m

which is also of class C^{r-1} by Lemma 2.5. We have a map B : R^{m^2} × R^m → R^m
which takes (Q,v) to Qv where Q is regarded as an m x m matrix and v ∈ R^m is
regarded as a column vector. The formulas for matrix multiplication are infinitely
differentiable, so B is C^∞. Now we have that

    (h \times B) : U' \times R^{m^2} \times R^m \to V' \times R^m

is C^r by Lemma 2.5. Now the map ĥ that takes (u,v) to (h(u), Dh_u(v)) is the
composition

    \hat{h} = (h \times B) \circ ((i, A) \times j)

which is C^{r-1}. (This argument was shown to me by Erik Pedersen who said that
the right approach to exercises of this type is to represent the map being analyzed
as the longest possible combination of simpler maps.)
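
To make the formula (u,v) to (h(u), Dh_u(v)) concrete, here is a small Python
sketch. It is an illustration only: the overlap map h (polar to Cartesian
coordinates on the plane) is a hypothetical choice, its derivative matrix A is
written out by hand, and the names B and h_hat merely mirror the letters used
in the argument above.

```python
import math

def h(u):
    """Hypothetical overlap map: polar coordinates (r, t) to Cartesian (x, y)."""
    r, t = u
    return [r * math.cos(t), r * math.sin(t)]

def A(u):
    """Matrix representation of Dh_u, computed by hand for this particular h."""
    r, t = u
    return [[math.cos(t), -r * math.sin(t)],
            [math.sin(t),  r * math.cos(t)]]

def B(Q, v):
    """Matrix multiplication (Q, v) -> Qv with v regarded as a column vector."""
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

def h_hat(u, v):
    """Overlap map on the tangent bundle: (u, v) -> (h(u), Dh_u(v))."""
    return h(u), B(A(u), v)

point, vector = h_hat([2.0, 0.0], [1.0, 3.0])
```

At u = (2, 0) the matrix of Dh_u is diagonal with entries 1 and 2, so the
base point goes to (2, 0) and the tangent vector (1, 3) goes to (1, 6).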
We have shown

THEOREM 6.1. If M is a C^r manifold, then TM is a C^{r-1} manifold.


We finish this section with a few statements about the tangent space of M.

The space TM is an example of a vector bundle. Thus it is often called the 
tangent bundle over M to distinguish it from the individual spaces T_x which are
the tangent spaces over the individual x ∈ M. A vector bundle over a space is a
structure over the space that includes a cover of the space and a collection of charts 
of the vector bundle that are made of products of the elements of the cover with a 
fixed vector space. A careful discussion then has to take place about overlap maps. 
We will not go into this. 

We have the map π : TM → M which takes each v to the x for which v ∈ T_x.
A section for π or a section of the tangent bundle is a map σ : M → TM which
satisfies (π∘σ)(x) = x for all x ∈ M. In words, each x is carried to a vector in T_x.
Recall that maps are continuous, so that we have a continuous choice of a vector
at x that is tangent to M at x. Another name for a section of the tangent bundle
is a vector field on M.

Note that each T_x has a zero vector. If σ : M → TM is a vector field, then it is
a non-zero vector field if no σ(x) is the zero vector. We have shown previously


THEOREM 6.2. There is no non-zero vector field on S^2.


Note that if TM has the structure M × R^m, then there is a non-zero vector
field. Take your favorite non-zero vector v in R^m and let σ(x) = v for all x ∈ M.
We thus have


COROLLARY 6.2.1. The structure of TS^2 is not that of S^2 × R^2.


7. The Inverse Function Theorem. 


In this section we present the first of several theorems that derive information 
from the derivative of a function. The idea behind such theorems is that if the 
derivative is such a good approximation to a function, then properties of the deriva- 
tive should be inherited to some extent by the function. The reason that this is 
useful is that the linearity of the derivative makes certain properties easy to detect 
on the level of the derivative.

The main theorem of this section, the Inverse Function Theorem, is that if a C^1
function f between manifolds has Df_x a vector space isomorphism for some x,
then f is locally a homeomorphism on some neighborhood of x. The continuity of
the derivative is vital in reaching a conclusion about a neighborhood of x.

There are other features of this section. The first theorem that one learns in 
calculus that extracts information from the derivative is the Mean Value Theorem. 
The importance of this theorem cannot be overemphasized. One of the steps of the 
proof of the Inverse Function Theorem is to develop a version of the Mean Value 
Theorem in higher dimensions. 

Another feature of this section is to introduce the phrase “by local change of 
coordinates, we can assume ... ” to the reader. This will occur several times,
once as a consequence of the Inverse Function Theorem that we give as a corollary.
Instead of trying to make a general lemma that states when this phrase can be 
invoked, we just give the examples to show how and when it is done. 

A third feature of this section is that we avoid partial derivatives to a degree 
verging on paranoia. Our arguments lie somewhere between the specificity of direct 
coordinate calculations and the generality of proving these theorems on Banach 
spaces. (This last can be done, and is done in several texts.) 

Lastly, this section unrolls the proof of the main theorem very slowly. Various 
intermediate results (such as the Mean Value Theorem) are stated and proven in 
the middle of the proof of the main theorem. To prove a homeomorphism, one 
must prove that a function is both one to one and onto. The proofs of these two 
parts are quite separate and are done with a large interruption in between to
introduce needed lemmas.

We start by stating the main theorem and giving a corollary. The theorem 
guarantees the existence of a homeomorphism and has something to say about the 
derivative of the inverse. 


THEOREM 7.1 (INVERSE FUNCTION THEOREM). Let f : M → N be a C^r function,
r ≥ 1, between manifolds, and assume that Df_x is an isomorphism for some
x ∈ M. Then there is an open set U about x so that V = f(U) is open in
N, so that f|_U is a homeomorphism onto V and so that (f|_U)^{-1} is C^r and if
(f|_U)^{-1}(z) = a, then

    D((f|_U)^{-1})_z = (Df_a)^{-1}.


COROLLARY 7.1.1. Let f, M, N and x be as in the theorem above with M and
N of class C^r. Then there is an expression h of f in local coordinates so that h
is the identity function from a Euclidean space to itself.


PROOF OF COROLLARY: Assume that M is an m-manifold. Since Df_x is an
isomorphism, the dimension of T_{f(x)} is m and N is an m-manifold. Assume the
conclusion of the Inverse Function Theorem with the notation as in the statement.
By the discussion in Section 5, we can find a coordinate chart (U_1,θ) with U_1 ⊂ U
in which θ is a homeomorphism onto R^m and so that f(U_1) is contained in the
domain of a chart (V_1,φ) for N. Thus, the expression h_1 of f in these coordinates
takes R^m onto an open subset of R^m. We know that h_1 and (h_1)^{-1} are C^r.
Let W = f(U_1) and let ψ = (h_1)^{-1} ∘ (φ|_W). Now (W,ψ) is a valid coordinate
chart for N and the expression of f using coordinates (U_1,θ) and (W,ψ) is the
identity from R^m to itself.


In the presence of the hypotheses of the Inverse Function Theorem, the corollary 
above is usually invoked with the words “by the Inverse Function Theorem we can 
assume that the function is just the identity on R^m in local coordinates.”

We will start the proof of the Inverse Function Theorem by first showing that
there is a neighborhood of x on which f is one to one. The main tool will be a
technique that controls how much points move under various maps. The main tool
for the control will be a Mean Value Theorem. We will start with that.


THEOREM 7.2 (MEAN VALUE THEOREM). Let f : R^m → R^n be C^1 and let
a, b ∈ R^m. Assume that ||Df_x|| ≤ K for some real K > 0 and for all x on the
straight line from a to b. Then ||f(b) - f(a)|| ≤ K||b - a||.


PROOF: Let x be on the line L from a to b and let ε be greater than 0. Consider
h small enough to make the following true:

    \|f(x + h) - f(x)\| - \|Df_x(h)\| \le \|f(x + h) - f(x) - Df_x(h)\| \le \epsilon \|h\|.

For such an h,

    \|f(x + h) - f(x)\| \le \|Df_x(h)\| + \epsilon \|h\|
                        \le \|Df_x\| \, \|h\| + \epsilon \|h\|
                        \le (K + \epsilon) \|h\|.

Now each x ∈ L has a δ_x > 0 so that the above holds whenever ||h|| < δ_x,
and we get an open cover of L. Pick a Lebesgue number η for this cover and
divide L into intervals of length less than η. Let the endpoints of the intervals be
a = x_0 < x_1 < ... < x_k = b. Now

    \|f(b) - f(a)\| \le \sum_{i=1}^{k} \|f(x_i) - f(x_{i-1})\|
                    \le (K + \epsilon) \sum_{i=1}^{k} \|x_i - x_{i-1}\|
                    = (K + \epsilon) \|b - a\|.

This can be done for any ε > 0 so the statement of the theorem holds.
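
The inequality can be checked numerically on an example. The Python sketch below
is a sanity check and not a proof: the map f from R^2 to R^2, its hand-computed
derivative Df and the endpoints a and b are hypothetical choices, and the bound K
is only estimated by sampling 1001 points of the segment.

```python
import math

def f(p):
    x, y = p
    return [math.sin(x) + y, x * y]

def Df(p):
    """Matrix of the derivative of f at p, computed by hand."""
    x, y = p
    return [[math.cos(x), 1.0], [y, x]]

def op_norm(J):
    """Operator norm of a 2 x 2 matrix: root of the top eigenvalue of J^T J."""
    (a, b), (c, d) = J
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d   # entries of J^T J
    top = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(top)

def dist(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])

a, b = [0.0, 0.0], [1.0, 2.0]
samples = [[a[0] + t / 1000 * (b[0] - a[0]), a[1] + t / 1000 * (b[1] - a[1])]
           for t in range(1001)]
K = max(op_norm(Df(p)) for p in samples)   # sampled bound, not a proven supremum
lhs = dist(f(b), f(a))                     # ||f(b) - f(a)||
rhs = K * dist(a, b)                       # K ||b - a||
```

For this example lhs comes out comfortably below rhs, as the theorem predicts.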


PROOF OF THE INVERSE FUNCTION THEOREM: INJECTIVITY: Since Df_x is a
linear isomorphism, the dimensions of the domain and range are the same. Let this
common dimension be m.

We now argue a reduction. We wish to replace the hypothesis of the Inverse
Function Theorem by one which assumes more about f than is given in the
statement. This will be another argument about simplifications that can be made with
local change of coordinates.

Consider an expression of f in local coordinates. We can call it h now, but we
will make improvements on it and still call it h. This is a function from an open
set in R^m to R^m and it carries the image of x under one coordinate map to the
image of f(x) under another. By composing the first coordinate function with a
translation we can assume that the image of x under the first coordinate function
is the origin. By composing the other coordinate function with a translation, we
can assume that the image of f(x) under the second coordinate function is also
the origin. Now we have that the expression h takes the origin to the origin, and
that Dh_0 is a linear isomorphism from R^m to R^m. We can compose the second
coordinate function with the inverse of this linear isomorphism and we have a new
expression h of f so that it carries the origin to the origin and so that Dh_0 is the
identity. If the Inverse Function Theorem is proven for h, then it will be true for
the f given in the statement.

We thus invoke the magic words “by a local change of coordinates ... ” and we
assume that f is a function from an open set U_1 in R^m to R^m that takes 0 to 0
and which has Df_0 as the identity from R^m to R^m.

We now wish to show that there is a neighborhood of 0 on which f is one to
one. This will follow immediately if we show that for all x, y in some neighborhood
of 0, we have

    (9)    \|f(x) - f(y)\| \ge \tfrac{1}{2} \|x - y\|.

To get this kind of inequality that says that f does not contract much, we apply
a transformation that reduces our task to showing that another function does not
expand much. Consider the function g(x) = x - f(x). Assume we can show that
in some neighborhood of 0 every x and y in this neighborhood satisfies

    (10)   \|g(x) - g(y)\| \le \tfrac{1}{2} \|x - y\|.

So

    \tfrac{1}{2} \|x - y\| \ge \|g(x) - g(y)\|
                           = \|(x - y) - (f(x) - f(y))\|
                           \ge \|x - y\| - \|f(x) - f(y)\|.

Thus we get (9).
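
The passage from (10) to (9) can be watched on an example. In the Python sketch
below, the map f (with f(0) = 0 and Df_0 the identity) and the grid of test
points are hypothetical choices; for this f the function g(x) = x - f(x) has
derivative of norm at most 0.28 on the square used, so both inequalities should
hold with room to spare.

```python
import math

def f(p):
    # hypothetical map with f(0) = 0 and Df_0 = identity
    x, y = p
    return [x + y * y, y + x * x]

def sub(u, v):
    return [u[0] - v[0], u[1] - v[1]]

def norm(u):
    return math.hypot(u[0], u[1])

def g(p):
    return sub(p, f(p))

grid = [i * 0.035 - 0.14 for i in range(9)]       # 9 values in [-0.14, 0.14]
points = [[x, y] for x in grid for y in grid]

worst_g = 0.0            # largest  ||g(p) - g(q)|| / ||p - q||, compare with (10)
worst_f = float("inf")   # smallest ||f(p) - f(q)|| / ||p - q||, compare with (9)
for p in points:
    for q in points:
        d = norm(sub(p, q))
        if d == 0.0:
            continue     # skip the pairs with p = q
        worst_g = max(worst_g, norm(sub(g(p), g(q))) / d)
        worst_f = min(worst_f, norm(sub(f(p), f(q))) / d)
```

On this grid worst_g stays below 1/2 and worst_f stays above 1/2, so in
particular f separates distinct grid points.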

Our task is now to show (10). This is now in a form that can be handled by the
Mean Value Theorem. We will be done by the Mean Value Theorem if we can show
that ||Dg_x|| ≤ 1/2 for all x in some neighborhood of the origin. Since f is C^r, so
is g. We know Df_0 is the identity, so Dg_0 = D(x - f(x))_0 = 0. We now need a
continuity argument.

Because Dg is continuous, we have a continuous map (which we can call Dg)
from U_1, the domain of g, to R^{m^2} which we identify with the space of linear maps
from R^m to itself. It takes u ∈ U_1 to Dg_u. We have

    U_1 \times R^m \xrightarrow{Dg \times 1} R^{m^2} \times R^m \xrightarrow{\mu} R^m

where μ represents matrix multiplication. The composition is continuous. The
composition takes (x,v) to Dg_x(v).

We now use this to estimate ||Dg_x|| for values of x near 0. We know Dg_0 is
the zero map and ||Dg_0|| = 0. That is, the image of the closed unit ball B in R^m is
the point 0 in R^m under Dg_0. By the continuity of μ∘(Dg × 1) each (x,v) in
({0} × B) ⊂ U_1 × R^m has a δ_{(x,v)} so that (y,w) within δ_{(x,v)} of (x,v) implies that
Dg_y(w) is within 1/2 of 0. This gives an open cover of the compact set ({0} × B)
with Lebesgue number η. Now for x within η of 0, we have Dg_x(B) within 1/2 of 0.
Thus for x within η of 0, we have ||Dg_x|| ≤ 1/2.

Combining this with our observations above, we have that f is one to one on the
open ball E of radius η around 0.


Before we start work on the proof that f is surjective onto some open set in R^m
that contains 0, we need some preliminaries. As a start, it becomes important at
this point to mention that we are using the Euclidean metric on R^m. That is, the
square root of the sum of the squares of the differences of the coordinates. We use
ρ to denote this metric. The property that we need from this metric is that straight
lines give the shortest distances between points. We only need this in the form of
a strict triangle inequality for non-degenerate triangles which can be deduced from
the law of cosines. It is used in the next chain of lemmas.


LEMMA 7.3. Let ABC be an isosceles triangle in R^m with ρ(A,B) = ρ(A,C) and
B ≠ C. Let D be a point in the interior of the segment AB. Then ρ(D,C) > ρ(D,B).


PROOF: If false, then the non-degenerate triangle ADC violates the strict triangle
inequality by having ρ(A,D) + ρ(D,C) no greater than ρ(A,C).


LEMMA 7.4. Let B be a closed, round ball in R^m and let y be a point in the
interior of B that is not the center. Let z be the point on the boundary of B that
is the intersection of a ray from the center of B through y. Then, for any point x
in R^m minus the interior of B, ρ(x,y) ≥ ρ(y,z), with equality only if x = z.


PROOF: If x is on the boundary of B and x ≠ z, then x, z and the center of B form an
isosceles triangle with y in the interior of one of the equal legs. The result follows
from the previous lemma. If x is not on the boundary of B, then the straight line
segment from y to x must hit the boundary of B in a point w interior to the
segment and w will be closer to y than x. But now w is farther from y than z
unless w = z.


LEMMA 7.5. Let B be a closed round ball in R^m and let z be a point on the
boundary of B. Let U be an open subset of R^n and let f : U → R^m be C^1
taking a point x to z. Assume that the image of f misses the interior of B. Then
Df_x is not a surjection.


PROOF: By applying a translation, we may assume that z is the origin. Let v
be the center of B. We will show that the image of Df_x does not contain v.
Since Df_x is linear, this is equivalent to showing that Df_x hits no non-zero
multiple of v. Assume that v is in the image. Then for some h ∈ R^n we have
Df_x(h) is a positive multiple of v. For real t > 0, consider

    (11)   \|f(x + th) - f(x) - Df_x(th)\|.

For small values of t, the vector Df_x(th) is parallel to v but shorter. Thus it
represents a point y in the interior of B that is not the center and, by the previous
lemma, z is the point not in the interior of B that is closest to y. Now f(x) = z
which is the origin, so (11) reduces to ||f(x + th) - y||. Since the hypothesis says
that f(x + th) is not in the interior of B, we know, from the previous lemma, that
||y|| ≤ ||f(x + th) - y|| which restates as

    \|Df_x(th)\| \le \|f(x + th) - f(x) - Df_x(th)\|.

But for any ε > 0, suitably small values of t > 0 make the right side less than
ε||th||. Linearity of Df_x gives t||Df_x(h)|| < tε||h|| or ||Df_x(h)|| < ε||h||. Since
this is true for any ε > 0, we must have Df_x(h) = 0. But now no multiple of
Df_x(h) equals v.

PROOF OF THE INVERSE FUNCTION THEOREM: SURJECTIVITY: We assume that
we work in the open ball E about 0 on which f is one to one. Let B be the closed
ball about 0 of radius half that of E. We know that f takes 0 to 0 and is one
to one on B. Thus no point of S, the boundary of B, is taken to 0. Since S is
compact, there is a minimum distance δ from 0 to f(S). Let B' be the open ball about
0 of radius δ/3. We claim that B' is in the image of B. Let y be a point in B'.
If y is not in the image of B, then there is a minimum distance γ > 0 from y to f(B)
and there is a point x in B for which ρ(y, f(x)) = γ. Now ρ(y,0) < δ/3 and 0 is
in the image of B, so γ < δ/3. Since δ is the minimum distance from 0 to f(S),
the triangle inequality says that the distance from y to any point in f(S) is at
least 2δ/3. Thus x is not in S and is in the interior of B.

We now have the situation of the previous lemma since f is a C^r map from
the interior of B to R^m which hits the boundary of the γ ball about y but not
the interior of that ball. Thus by the previous lemma, Df_x is not surjective.
In particular, it is not an isomorphism. This occurred inside a given ball B, so
if f is not surjective onto some open neighborhood, then it happens arbitrarily
close to 0. Now if Df_x is not an isomorphism, then its matrix representation
has determinant 0. Thus if f is not surjective onto some open set, then there
are points x_i converging to 0 whose derivatives have determinant 0. But Df_0 is
an isomorphism and has non-zero determinant. The determinant is a continuous
function of the entries of a matrix. Since f is C^1, we have a contradiction.


We are not quite done. The statement of the theorem has something to say about
the differentiability of the inverse function and we do not yet even know if the
inverse is continuous. The next arguments finish the proof.


PROOF OF THE INVERSE FUNCTION THEOREM: CONCLUSION: We have that f is
a continuous one to one correspondence from some open set U containing 0 to an
open set W containing 0. By the argument just above using the continuity of Df,
we can also assume that the neighborhood U has been picked so that Df_x is an
isomorphism for all x ∈ U.

Let z, w be in W and let x, y in U be such that f(x) = z and f(y) = w.
Denote the inverse of f by F. From (9) we have

    \|z - w\| \ge \tfrac{1}{2} \|F(z) - F(w)\|

or

    \|F(z) - F(w)\| \le 2 \|z - w\|

which shows the continuity of F.
To validate the claim in the statement of the Inverse Function Theorem about
the derivative of F, we must look at

    (12)   \|F(w) - F(z) - (Df_x)^{-1}(w - z)\| = \|y - x - (Df_x)^{-1}(f(y) - f(x))\|.

The expression inside the norm in (12) is obtained from the expression inside the
norm of the next expression by applying (Df_x)^{-1}. Thus if K = ||(Df_x)^{-1}||, then
(12) is no greater than

    (13)   K \|Df_x(y - x) - f(y) + f(x)\| = K \|f(y) - f(x) - Df_x(y - x)\|.

Now (13) can be kept less than (ε/2)||y - x|| for a given ε > 0 by keeping ||y - x||
suitably small. We want our original (12) (which is no greater than (13)) smaller
than ε||w - z||. But another application of (9) gives us

    (\epsilon/2) \|y - x\| \le \epsilon \|f(y) - f(x)\| = \epsilon \|w - z\|.

We obtain this by controlling ||y - x|| = ||F(w) - F(z)||. We want to do it by
controlling ||w - z||. But by (9) again, ||F(w) - F(z)|| ≤ 2||w - z|| so keeping
||w - z|| half the size required for ||y - x|| = ||F(w) - F(z)|| will do the job. This
shows that F is differentiable and that its derivative is as claimed in the statement
of the theorem.
We now show that $F$ is $C^r$. We have $DF_z = (Df_{F(z)})^{-1}$. We can regard
$z \mapsto DF_z$ as a composition of three functions $i \circ Df \circ F$ where $i : \mathbf{R}^{n^2} \to \mathbf{R}^{n^2}$
is the operation of matrix inverse. Cramer's rule (a formula for matrix inversion
involving determinants) shows that $i$ is $C^\infty$. Since $f$ is $C^1$, the function $x \mapsto Df_x$
is continuous. Thus

$$DF = i \circ Df \circ F \tag{14}$$

is continuous and $F$ is $C^1$. But now if $f$ is $C^2$, then all the functions on the right
side of (14) have continuous derivatives and $F$ is $C^2$. Further, the derivative of
both sides of (14) and the chain rule give $D^2F$ as a composition involving $DF$,
$Di$ and $D^2f$. But (14) can be used again to replace $DF$ in the composition with
the right side of (14) in which only $F$ and not $DF$ appears. Since $i$ is infinitely
differentiable, the only thing to stop this process is the limit on the differentiability
of $f$. Inductively, we get that if $f$ is $C^r$, then so is $F$.
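The formula $DF_z = (Df_{F(z)})^{-1}$ can be checked numerically on a concrete map. The following is only a sketch under stated assumptions: the test map $f(x,y) = (e^x \cos y, e^x \sin y)$, its local inverse `F`, and the helper names `jacobian`, `inverse2` and `max_gap` are all hypothetical choices made for illustration and do not come from the notes. Derivatives are approximated by central differences.

```python
import math

def f(x, y):
    # Hypothetical test map (an assumption): f(x, y) = (e^x cos y, e^x sin y).
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def F(u, v):
    # Local inverse of f on the half plane u > 0.
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

def jacobian(g, p, h=1e-6):
    # Central-difference Jacobian of g: R^2 -> R^2 at p, as a 2x2 nested list.
    a, b = p
    gx1, gx0 = g(a + h, b), g(a - h, b)
    gy1, gy0 = g(a, b + h), g(a, b - h)
    return [[(gx1[0] - gx0[0]) / (2 * h), (gy1[0] - gy0[0]) / (2 * h)],
            [(gx1[1] - gx0[1]) / (2 * h), (gy1[1] - gy0[1]) / (2 * h)]]

def inverse2(m):
    # Inverse of a 2x2 matrix by Cramer's rule.
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def max_gap(z):
    # Largest entry of |DF_z - (Df_{F(z)})^{-1}|.
    lhs = jacobian(F, z)
    rhs = inverse2(jacobian(f, F(*z)))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

print(max_gap((1.2, 0.3)))
```

The printed gap should be tiny, limited only by the finite-difference error. The use of Cramer's rule in `inverse2` mirrors the role of the map $i$ in (14).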


[The proof of surjectivity above can be short circuited significantly by replacing
the geometric argument about the derivative at the point of closest approach to a
point in the range by a more algebraic one. The right quantity to use to detect the
closest approach is the square of the distance. This has the double advantage
that the square of the distance has a simple formula that is differentiable and that
it can be represented by a dot product. It turns out that formulas involving the
dot product are easy to differentiate. In fact, the dot product is an example of a
bilinear map and these are easy to differentiate. Let $f : A \times B \to C$ be a bilinear
map between vector spaces. That means that $f(a, b_1 + b_2) = f(a, b_1) + f(a, b_2)$,
$f(a_1 + a_2, b) = f(a_1, b) + f(a_2, b)$, and $r f(a,b) = f(ra, b) = f(a, rb)$. Unfortunately,
it also means that $f$ is not linear unless one of $A$ or $B$ is trivial so we cannot say
that $Df = f$. Consider the inclusions $i_v : A \to A \times B$ defined by $i_v(u) = (u,v)$
and $j_u : B \to A \times B$ defined by $j_u(v) = (u,v)$. Each is a constant plus a linear
map. For example $i_v(u) = (0,v) + i_0(u)$ and $i_0$ is linear. Thus $D(i_v)_u = i_0$ for
all $u$ and $v$, and $D(j_u)_v = j_0$ for all $u$ and $v$. Now the compositions $(f \circ i_v)$ and
$(f \circ j_u)$ are basically the restrictions of $f$ to $A \times \{v\}$ and to $\{u\} \times B$ respectively
and are also linear (since $f$ is bilinear) and are their own derivatives.

This observation and the chain rule give

$$\begin{aligned}
(f \circ i_v) &= D(f \circ i_v)_u \\
&= (Df_{i_v(u)}) \circ i_0 \\
&= (Df_{(u,v)} \circ i_0),
\end{aligned}$$

and

$$\begin{aligned}
(f \circ j_u) &= D(f \circ j_u)_v \\
&= (Df_{j_u(v)}) \circ j_0 \\
&= (Df_{(u,v)} \circ j_0).
\end{aligned}$$

These can be applied to $a \in A$ and $b \in B$ as appropriate to give

$$(f \circ i_v)(a) = (Df_{(u,v)} \circ i_0)(a), \quad\text{or}\quad f(a,v) = Df_{(u,v)}(a, 0),$$

and

$$(f \circ j_u)(b) = (Df_{(u,v)} \circ j_0)(b), \quad\text{or}\quad f(u,b) = Df_{(u,v)}(0, b).$$

Since $Df_{(u,v)}$ is a linear map, we have

$$Df_{(u,v)}(a,b) = f(a,v) + f(u,b).$$

We can now apply this to dot products. Consider $d : \mathbf{R}^m \times \mathbf{R}^m \to \mathbf{R}$ where
$d(u,v)$ is the dot product of $u$ and $v$. This is bilinear so the above applies. Consider
$f : X \to \mathbf{R}^m$ and $g : Y \to \mathbf{R}^m$. We have $(f \cdot g) = d \circ (f \times g)$. Now $D(f \cdot g) =
Dd \circ (Df \times Dg)$. More specifically

$$\begin{aligned}
D(f \cdot g)_{(x,y)}(a,b) &= Dd_{(f(x),g(y))} \circ (Df_x \times Dg_y)(a,b) \\
&= Dd_{(f(x),g(y))}(Df_x(a), Dg_y(b)) \\
&= f(x) \cdot Dg_y(b) + g(y) \cdot Df_x(a).
\end{aligned}$$

This is often referred to as a product formula.

Going back to the proof of surjectivity, it is now possible to use this to show
that if $x$ has $f(x)$ the closest point to $y$, then all vectors in the image of $Df_x$ are
perpendicular to the vector from $f(x)$ to $y$.]
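The derivative formula for a bilinear map in the remark above, $Df_{(u,v)}(a,b) = f(a,v) + f(u,b)$, can be checked numerically for the dot product. This is a small sketch only; the function names and the sample vectors are hypothetical choices, and the derivative is approximated by a central difference along the direction $(a,b)$.

```python
def dot(u, v):
    # The bilinear map d(u, v) = u . v on R^m x R^m.
    return sum(ui * vi for ui, vi in zip(u, v))

def directional_derivative(u, v, a, b, h=1e-6):
    # Central difference of t -> d(u + t a, v + t b) at t = 0.
    up = [ui + h * ai for ui, ai in zip(u, a)]
    vp = [vi + h * bi for vi, bi in zip(v, b)]
    um = [ui - h * ai for ui, ai in zip(u, a)]
    vm = [vi - h * bi for vi, bi in zip(v, b)]
    return (dot(up, vp) - dot(um, vm)) / (2 * h)

def bilinear_formula(u, v, a, b):
    # Dd_{(u,v)}(a, b) = d(a, v) + d(u, b).
    return dot(a, v) + dot(u, b)

u, v = [1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]
a, b = [0.3, -0.7, 1.0], [2.0, 1.0, -1.0]
print(directional_derivative(u, v, a, b), bilinear_formula(u, v, a, b))
```

For a bilinear map the central difference is exact up to rounding, since the second order terms cancel, so the two printed numbers should agree closely.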


8. The $C^r$ category and diffeomorphisms.


There is a category whose objects are $C^r$ manifolds and whose morphisms are
$C^r$ functions. The categorical isomorphisms are called $C^r$ diffeomorphisms. They
are the morphisms in the category that have inverses in the category. This is a
stronger requirement than just requiring that the morphism have an inverse as a
function.

Consider the function $f(x) = x^3$ from $\mathbf{R}$ to $\mathbf{R}$. The function $f$ is $C^\infty$ and is a
homeomorphism. However it is not even a $C^1$ diffeomorphism since its inverse has
no derivative at 0. However it is a consequence of the Inverse Function Theorem
that if $f$ is a $C^r$ homeomorphism (that is, a homeomorphism that happens to be
$C^r$) and $Df_x$ is non-singular for each $x$, then $f$ is a $C^r$ diffeomorphism. Note
how this does not apply to $f(x) = x^3$.
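The failure of the inverse of $x^3$ to be differentiable at 0 can be seen numerically: the difference quotient of the cube root at 0 is $h^{-2/3}$, which grows without bound as $h$ shrinks. The sketch below uses hypothetical helper names chosen for illustration.

```python
import math

def cube(x):
    return x ** 3

def cube_root(y):
    # Inverse of cube on all of R (handles negative arguments).
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def difference_quotient_at_zero(h):
    # (cube_root(h) - cube_root(0)) / h, which equals h^(-2/3) for h > 0.
    return (cube_root(h) - cube_root(0.0)) / h

for h in (1e-3, 1e-6, 1e-9):
    print(h, difference_quotient_at_zero(h))
```

The quotients increase as $h$ decreases, so no limit exists at 0, in line with the fact that the derivative of $x^3$ is singular there.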

Two diffeomorphic manifolds “behave the same” with respect to questions about
differentiable maps. Every diffeomorphism is a homeomorphism so diffeomorphic
manifolds are homeomorphic. The converse is not true. There are eight manifolds,
no two of which are $C^\infty$ diffeomorphic, that are all homeomorphic to the same sphere.
There is an uncountable collection of manifolds, no two of which are $C^\infty$ diffeomorphic, but
which are all homeomorphic to $\mathbf{R}^4$. The class of differentiability is uninteresting
in these questions once $C^1$ is reached. The following is one version of this.


THEOREM 8.1.


(1) Let $1 \le r < \infty$. Every $C^r$ manifold is $C^r$ diffeomorphic to a $C^\infty$ manifold.
(2) Let $1 \le r < s \le \infty$. If two $C^s$ manifolds are $C^r$ diffeomorphic, then they
are $C^s$ diffeomorphic.

The above theorem can be found in Differential topology by Morris W. Hirsch,
Page 52.

Consider $f : M \to N$ a $C^r$ map between $C^r$ manifolds. Let the dimensions of
$M$ and $N$ be $m$ and $n$ respectively. We have that $Df_x : T_x \to T_{f(x)}$ is a linear
map. This allows us to define $Tf : TM \to TN$ by $Tf(v) = Df_{\pi(v)}(v) \in T_{f(\pi(v))}$.
This gives a nice well defined function, but it tells us little about how it cooperates
with the structures on $TM$ and $TN$ as $C^{r-1}$ manifolds. If $(U,\theta)$ is a chart
with $x \in U$ and $(V,\phi)$ is a chart with $f(x) \in V$, then we can express $f$ in
local coordinates as $h = \phi \circ f \circ \theta^{-1}$. We also get coordinate charts for $TM$
and $TN$, induced by $\theta$ and $\phi$ and with domains $TU$ and $TV$, that contain the
relevant points. The images of these
coordinate functions are $\theta(U) \times \mathbf{R}^m$ and $\phi(V) \times \mathbf{R}^n$ respectively. The expression
of $Tf$ in these local coordinates from $\theta(U) \times \mathbf{R}^m$ to $\phi(V) \times \mathbf{R}^n$ takes $(\theta(x), v)$
to $(\phi(f(x)), w)$, where $w$ is the coordinate expression of $Df_x$ applied to $v$, which by
Lemma 3.3 means that $(p,v)$ is taken to
$(h(p), Dh_p(v))$. As discussed in Section 6, this is a $C^{r-1}$ map. Since $Tf$ behaves
functorially on each $T_x$ and it carries each $T_x$ into $T_{f(x)}$, it is easy to show that
$Tf$ behaves functorially in general. Specifically, $T(f \circ g) = Tf \circ Tg$ and if $f$ is the
identity on $M$, then $Tf$ is the identity on $TM$. We thus have


THEOREM 8.2. The operator $T$ is a functor from the category of $C^r$ manifolds
and $C^r$ maps, $r \ge 1$, to the category of $C^{r-1}$ manifolds and $C^{r-1}$ maps.
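In local coordinates the tangent map takes $(p,v)$ to $(h(p), Dh_p(v))$, and the functorial identity $T(f \circ g) = Tf \circ Tg$ is then just the chain rule. The following sketch checks this for maps from $\mathbf{R}$ to $\mathbf{R}$. The helper `tangent_map`, the sample maps and the use of a central difference for the derivative are all assumptions made for illustration.

```python
import math

def tangent_map(h, eps=1e-6):
    # Local-coordinate form of Th for h: R -> R, taking (p, v) to (h(p), Dh_p(v)).
    def Th(p, v):
        dh = (h(p + eps) - h(p - eps)) / (2 * eps)  # approximates h'(p)
        return (h(p), dh * v)
    return Th

f = math.sin                      # hypothetical sample map
g = lambda x: x * x + 1.0         # hypothetical sample map

T_fg = tangent_map(lambda x: f(g(x)))   # T(f o g)
Tf, Tg = tangent_map(f), tangent_map(g)

p, v = 0.7, 2.0
print(T_fg(p, v), Tf(*Tg(p, v)))        # the two pairs should agree closely
```

The base points agree exactly and the vector components agree up to the finite-difference error.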


9. Vector fields and flows. 


This section is about differential equations and their solutions. Rather than start
this section with a differential equation and look for a solution, we look at a function
and see what differential equation it solves. Then we can discuss general differential
equations and their solutions.

Let $f : \mathbf{R} \to M$ be a $C^1$ function into a $C^r$ manifold. We regard $\mathbf{R}$ as a
$C^1$ manifold and we assume a $C^1$ differential structure on it that contains the
coordinate chart $(\mathbf{R}, i)$ where $i$ is the identity map from $\mathbf{R}$ to itself.

Since $i : \mathbf{R} \to \mathbf{R}$ is the identity map, $[i]$ represents an element of $T_0 \subset T\mathbf{R}$.
Note that 0 (the additive identity) in the vector space $T_0$ is $[0]$, the class of the
constant map taking all of $\mathbf{R}$ to 0. This is because the isomorphism from $T_0$ to $\mathbf{R}$
given by Lemma 1.1 for the chart $i$ sends $[0]$ to $(i \circ 0)'(0) = 0$. It sends $[i]$ to
$(i \circ i)'(0) = 1$, so $[i] \ne 0$
in $T_0$. (Because $[i]$ is sent to $(i \circ i)'(0) = 1$, we could try to identify $[i]$ with 1 in $T_0$, but
this is dependent on our choice of coordinate function and we will content ourselves
with the fact that $[i]$ is not 0 in $T_0$.)

From the definition of tangent spaces, $[f]$ is an element of $T_{f(0)}$. We have
$Df_0[i] = [f \circ i] = [f]$. We thus have an interpretation of the vector that $f$ represents
at $f(0)$.

It should also be possible for $f$ to represent vectors at other points of its image.
Note that $[f]$ is the set of curves that take 0 to $f(0)$ and that have derivatives
at 0 the same as $f'(0)$ (as measured in any coordinate chart). It is reasonable to
define, for any $t \in \mathbf{R}$, that $f$ represents a vector at $f(t)$ which is the class of curves
that take 0 to $f(t)$ and that have the same derivatives at 0 as $f'(t)$ (as measured
in any coordinate chart) so we make this a definition. Note that one curve in this
class is the curve defined by $f_t(x) = f(x + t) = (f \circ \theta_t)(x)$ (where $\theta_t(x) = x + t$ is
the translation of $\mathbf{R}$ that takes 0 to $t$) since $f_t(0) = f(t)$ and $f_t'(0) = f'(t)$. Also
note that $D(f_t)[i] = (Df \circ D\theta_t)[i] = Df(D\theta_t[i])$ where $D\theta_t[i]$ is an element of $T_t$ in
$T\mathbf{R}$. Thus we are using the translations to give preferred isomorphisms from $T_0$ to
the various $T_t$ in $T\mathbf{R}$. We can use $[f_t]$ as the tangent to the curve $f$ at $f(t)$ and,
tempting danger, we recycle the prime notation for derivative and let $f'(t)$ denote
this tangent $[f_t]$. [Note also that $[\theta_t] \in T_t \subset T\mathbf{R}$ since $\theta_t(0) = t$. Thus $Df_t[\theta_t]$
makes sense and $Df_t[\theta_t] = [f \circ \theta_t] = [f_t] = f'(t)$ in our new notation, so we have
another view of $f'(t)$.]

From the above discussion, a curve $f : \mathbf{R} \to M$ defines a set of vectors $f'(t) = [f_t]$
that are tangent to the curve at the various points of its image. These tangents give
derivative information about the curve at each of its points. A differential equation
will go the other way. We will start with vectors and try to find curves that the
vectors are tangent to.

One way to start with vectors is to start with a vector field. In deference to
customary notation, we will usually use capital letters from the end of the Roman
alphabet to denote vector fields. Thus, let $X : M \to TM$ be a vector field.
Specifically, $X$ is a section of the tangent bundle. A curve $f : \mathbf{R} \to M$ is an
integral curve for $X$, if for each $t \in \mathbf{R}$ we have $f'(t) = X(f(t))$. If $x \in M$, then
we say that the integral curve starts at $x$ if $x = f(0)$. An initial value problem
is a vector field $X$ on $M$ and a point $x \in M$. A solution of the initial value
problem is an integral curve for $X$ starting at $x$. We will relate the solutions
of initial value problems with the standard existence and uniqueness theorems for
differential equations of functions of a real variable.

The following was proven in class in the Fall semester. 


THEOREM 9.1. Let $f(t,x)$ be a function of two real variables defined on some open
set $U$ of $\mathbf{R}^2$. Assume that $f$ is continuous, and that $(t_0, x_0)$ is given in $U$. Then
there is an open interval $J$ in $\mathbf{R}$ containing $t_0$ and a $C^1$ function $\phi : J \to \mathbf{R}$ so
that $\phi(t_0) = x_0$ and so that for all $t \in J$, $(t, \phi(t))$ is in $U$ and $\phi'(t) = f(t, \phi(t))$.
Further, if $f$ satisfies a Lipschitz condition with respect to the second variable, and
$\theta : K \to \mathbf{R}$ for an open interval $K \subset J$ satisfies all the same requirements as $\phi$,
then $\theta = \phi|_K$.


This is the standard theorem that guarantees for each initial value problem

$$x' = f(t, x), \qquad x(t_0) = x_0, \tag{15}$$

there exists locally a unique solution. We must make a comment about the solutions.
Consider $x(t) = \tan(t)$. This cannot be defined continuously on any open
interval containing $\pm\pi/2$. Thus the maximal open interval containing 0 that this
function can be defined on is $(-\pi/2, \pi/2)$. Note that $x'(t) = \sec^2(t) = 1 + \tan^2(t) =
1 + x^2(t)$ so that $x$ satisfies the initial value problem

$$x' = 1 + x^2, \qquad x(0) = 0.$$

Thus it may be impossible for the solutions guaranteed in Theorem 9.1 to be defined
on all of $\mathbf{R}$. This will have some effect later in this section. We will mention later
how this is sometimes prevented.
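The behaviour of the solution of $x' = 1 + x^2$, $x(0) = 0$ can be sketched numerically. The code below is an illustration only: it uses a classical fourth order Runge-Kutta step, which is a swapped-in numerical method and not something the notes rely on, and the names `rk4_step`, `integrate` and `field` are hypothetical.

```python
import math

def rk4_step(f, x, h):
    # One classical Runge-Kutta step for the autonomous equation x' = f(x).
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, x0, t_end, steps):
    # Approximate x(t_end) for x' = f(x), x(0) = x0, using equal steps.
    h = t_end / steps
    x = x0
    for _ in range(steps):
        x = rk4_step(f, x, h)
    return x

def field(x):
    return 1.0 + x * x

for t in (0.5, 1.0, 1.5):
    print(t, integrate(field, 0.0, t, 3000), math.tan(t))
```

The numerical values track $\tan(t)$ and grow rapidly as $t$ approaches $\pi/2$, which is the blow-up that prevents the solution from being defined on all of $\mathbf{R}$.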

We would like to apply a theorem like Theorem 9.1 to a manifold setting. We 
will comment on some aspects of this theorem that need modification before we 
make the application. 

Theorem 9.1 has the derivative conditions given by $f$ varying with both time and
position. This is reflected in the notation $f(t,x)$. The setting to which we would
like to apply the theorem has a fixed vector field which gives derivative (tangent)
conditions at each point, but which does not depend on time (does not depend
on the time of arrival of the curve). Extracting less information from Theorem
9.1 is no problem. We can restrict ourselves to time independent systems (the
adjective is autonomous) which we disguise as time dependent ones by taking an
autonomous $f(x)$ and rewriting it as an apparently time dependent $F(t,x)$ defined
by $F(t,x) = f(x)$. At this point we can apply standard existence and uniqueness
theorems as if time were a factor. Note that autonomous systems are ones where
the function giving the derivative information does not depend on time, however
the parameter for any solution is still time. Thus $x' = f(x)$ still has $x$ as a function
of $t$ and $x'$ still means $dx/dt$.

[If the entire theory were developed for autonomous systems, then the theory 
for time dependent systems could actually be recovered. Given a time dependent 
system, we can regard it as an autonomous system on a domain that has one more 
dimension than the original. The derivative information in the new system will 
have vector components the same as they were in the original dimensions and vector 
component 1 in the new dimension (which may as well be regarded as the time 
dimension). This will force solution curves to move along in the extra dimension 
at unit speed and thus pass through points in the other dimensions with the right 
derivative information for each time t.] 
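The passage from a time dependent system to an autonomous one in one more dimension can be sketched as follows. The example equation $x' = t - x$ with $x(0) = 1$, its exact solution $t - 1 + 2e^{-t}$, the Runge-Kutta integrator and all function names are assumptions chosen for illustration.

```python
import math

def autonomize(f):
    # Turn x' = f(t, x) into the autonomous field (s, x) -> (1, f(s, x)).
    def field(state):
        s, x = state
        return (1.0, f(s, x))
    return field

def rk4(field, state, h, steps):
    # Classical Runge-Kutta for an autonomous system on tuples of floats.
    for _ in range(steps):
        k1 = field(state)
        k2 = field(tuple(y + 0.5 * h * k for y, k in zip(state, k1)))
        k3 = field(tuple(y + 0.5 * h * k for y, k in zip(state, k2)))
        k4 = field(tuple(y + h * k for y, k in zip(state, k3)))
        state = tuple(y + (h / 6.0) * (a + 2 * b + 2 * c + d)
                      for y, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def f(t, x):
    return t - x

def exact(t):
    # Solution of x' = t - x with x(0) = 1.
    return t - 1.0 + 2.0 * math.exp(-t)

s, x = rk4(autonomize(f), (0.0, 1.0), 0.01, 200)
print(s, x, exact(2.0))
```

The extra coordinate `s` moves at unit speed and so reproduces the time, while the remaining coordinate follows the solution of the original time dependent problem.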

The result of the previous two paragraphs’ discussion is that vector fields and 
differential equations will be assumed autonomous. 

The next modification is to introduce extra space dimensions into the theorem.
We can use the same notation (taking into account the removal of the dependence
on time) and write problems as $x' = f(x)$. However, we now regard $x$ as an
element of $\mathbf{R}^m$ instead of $\mathbf{R}$ and the derivative $x'$ will also be an element of
$\mathbf{R}^m$. Thus $f(x)$ has to be an element of $\mathbf{R}^m$ and $f$ is a function from $\mathbf{R}^m$ to
$\mathbf{R}^m$. This change turns out to be very minor. The proof of Theorem 9.1 from last
semester goes through almost without change to prove a version of Theorem 9.1 in
dimensions above 1.

At this point we can sketch how a modified version of Theorem 9.1 can be applied
to vector fields on a manifold. Let $X : M \to TM$ be a vector field on a $C^r$
$m$-manifold $M$. If we wish some uniqueness in our discussion (and we do), we will
need a Lipschitz condition at the appropriate place. One easy way to get a Lipschitz
condition for a function, at least locally, is to assume that it is $C^1$. This follows from the
Mean Value Theorem (exercise). The Lipschitz condition is to be applied to the
function giving the derivative information as a function of the spatial coordinates.
In our setting this is the vector field $X$. Thus, we want to assume that $X$ is $C^1$.
This means that $TM$ must have at least a $C^1$ structure. From Section 6, we know
that $M$ must have at least a $C^2$ structure. We thus assume that $r \ge 2$.

Let $(U,\zeta)$ be a coordinate chart for $M$. We have available the induced homeomorphism
of $TU$ onto $\zeta(U) \times \mathbf{R}^m$; write $\hat\zeta(v) \in \mathbf{R}^m$ for the second coordinate of the image
of $v \in TU$. We can set up an autonomous
differential equation $x' = \hat\zeta(X(\zeta^{-1}(x)))$ on $\zeta(U)$. Let $\phi$ be a solution satisfying
an initial condition $\phi(0) = x_0 \in \zeta(U)$. Consider $f = \zeta^{-1}(\phi)$ as a curve in $M$. We
have $f'(t) = [f_t] = [f \circ \theta_t]$ where $\theta_t$ is translation by $t$. But $[f \circ \theta_t]$ is understood
by looking at its image under $\hat\zeta$. Namely, at the derivative of $\zeta \circ f \circ \theta_t$ at 0. This is

$$\begin{aligned}
(\phi \circ \theta_t)'(0) &= \phi'(t) \\
&= \hat\zeta(X(\zeta^{-1}(\phi(t)))) \\
&= \hat\zeta(X(f(t))).
\end{aligned}$$

But this just says that the image under $\hat\zeta$ of $f'(t)$ is just the image of $X(f(t))$ under
$\hat\zeta$. Thus $f'(t) = X(f(t))$ and $f$ is an integral curve for $X$. It starts at $\zeta^{-1}(\phi(0)) =
\zeta^{-1}(x_0)$. It is an exercise to show that another coordinate chart containing $\zeta^{-1}(x_0)$
gives an integral curve starting there that must agree on overlapping parts of the
domains. The exercise would use the overlap maps to relate one solution to the
other and then quote uniqueness to show that they must agree as maps into $M$.
The above sketch gives support to the following.


THEOREM 9.2. Let $M$ be a $C^r$ manifold with $r \ge 2$. Let $X$ be a $C^s$ vector field
on $M$ with $s \ge 1$. Then for any $x \in M$, there is a unique integral curve for $X$
that starts at $x$ and that is defined on some open interval in $\mathbf{R}$ containing 0.


We want more. This will require another modification to the existence and 
uniqueness theorems above. Because of the techniques that allow results on Eu- 
clidean spaces to be applied to manifolds and vice versa, we will not distinguish 
much from now on between Theorems 9.1 and 9.2. 

The last modification is far from minor. We introduce a new concept to discuss
it. Let $\phi : J \to M$ be a curve where $J$ is an open interval in $\mathbf{R}$. Assume for the
moment that $\phi$ is one to one. We can talk about a flow that is defined along the
image of the curve. The flow will involve a motion of the points on the image of the
curve. If $x = \phi(t_0)$ then we can define $\Phi_t(x) = \phi(t_0 + t)$. Note that $\Phi_0(x) = x$.
We can think of $\Phi_t$ as a function that pushes points $t$ units along the curve with
$t$ measured in the domain of $\phi$. We have to be careful if $J$ is not all of $\mathbf{R}$. If this
is the case, then $\Phi_t$ is only defined on those $x$ with a $t_0 \in J$ for which $\phi(t_0) = x$
and $t_0 + t \in J$. The domain of a given $\Phi_t$ can easily turn out to be empty. We
have actually defined a family of functions and we will refer to the entire family as
a flow. One relation that the maps $\Phi_t$ satisfy, for any $x$ in the image of $\phi$, is

$$\begin{aligned}
(\Phi_t \circ \Phi_s)(x) &= \Phi_t(\phi(s + t_0)) \\
&= \phi(t + s + t_0) \\
&= \Phi_{s+t}(x)
\end{aligned}$$

using the fact that $x$ in the image of $\phi$ has a unique $t_0$ satisfying $x = \phi(t_0)$. The
above relation must be treated with care in those situations where the domain of
$\phi$ is not all of $\mathbf{R}$.

If $\phi$ is not one to one, then we get into potential problems of well definedness.
These problems go away if the curve is an integral curve for an autonomous system for
which uniqueness holds.

Now assume that $\phi$ is an integral curve for a vector field $X$ in that $\phi'(t) =
X(\phi(t))$. (It will be very important for what we want to say that we are in the
autonomous case.) Assume that $\phi$ is not one to one and assume that the differential
equation satisfies hypotheses that make solutions to the initial value problems
unique. Let $x_0 = \phi(t_0) = \phi(t_1)$ with $t_0 \ne t_1$. Now $\phi(t)$ is a solution to the initial
value problem

$$x' = X(x), \qquad x(t_0) = x_0.$$

Consider

$$\begin{aligned}
\phi_1(t) &= \phi(t + (t_1 - t_0)) \\
&= (\phi \circ \theta_{t_1 - t_0})(t)
\end{aligned}$$

where $\theta_{t_1 - t_0}$ is translation in $\mathbf{R}$ by $t_1 - t_0$. We have

$$\begin{aligned}
\phi_1'(t) &= \phi'(\theta_{t_1 - t_0}(t)) \\
&= \phi'(t + (t_1 - t_0)) \\
&= X(\phi(t + (t_1 - t_0))) \\
&= X(\phi_1(t))
\end{aligned}$$

and

$$\phi_1(t_0) = \phi(t_1) = x_0$$

so $\phi_1$ is also a solution to the same initial value problem. Thus by uniqueness
$\phi_1 = \phi$ and for all $t$, $\phi(t) = \phi(t + (t_1 - t_0))$. This makes $\phi$ periodic. It also makes
the flow well defined. If $\phi(t_0) = \phi(t_1) = x$ then $\Phi_t(x)$ written as $\phi(t + t_0)$ or
$\phi(t + t_1) = \phi(t + t_0 + (t_1 - t_0))$ specifies only one point.

We claim that there are two possibilities in the above situation (non-injective
integral curve for autonomous system): either $\phi$ is a constant map or there is a
$\delta > 0$ so that

$$\phi(t + \delta) = \phi(t) \tag{16}$$

for all $t$ and $\delta$ is the minimum positive real for which (16) holds. If (16) holds for a
given $\delta$, then $\phi(t + n\delta) = \phi(t)$ for all $n \in \mathbf{Z}$. If there are arbitrarily small, positive
$\delta$ for which (16) holds, then the set of points in $\mathbf{R}$ which map to $\phi(t)$ is dense in
$\mathbf{R}$. But this is the set $\phi^{-1}(\phi(t))$ which must be closed and therefore all of $\mathbf{R}$, so
$\phi$ is constant. Otherwise the infimum of the positive $\delta$ for which (16) holds is positive
and, by the continuity of $\phi$, itself satisfies (16). Note
that a flow using a constant curve makes sense. It is just the constant flow.
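A standard example of a non-injective, non-constant integral curve is the circle $\phi(t) = (\cos t, \sin t)$ for the rotation field $X(x,y) = (-y, x)$ on $\mathbf{R}^2$, with minimum period $\delta = 2\pi$. This example is an assumption added for illustration and is not taken from the notes. The sketch below checks that $\phi$ is an integral curve and that it repeats after $2\pi$ but not after $\pi$.

```python
import math

def X(p):
    # Hypothetical example field on R^2: rotation about the origin.
    x, y = p
    return (-y, x)

def phi(t):
    # Candidate integral curve through (1, 0).
    return (math.cos(t), math.sin(t))

def phi_prime(t, h=1e-6):
    # Central-difference approximation of phi'(t).
    a, b = phi(t + h), phi(t - h)
    return ((a[0] - b[0]) / (2 * h), (a[1] - b[1]) / (2 * h))

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

t = 0.4
print(dist(phi_prime(t), X(phi(t))))          # close to 0: phi' = X(phi)
print(dist(phi(t + 2 * math.pi), phi(t)))     # close to 0: period 2*pi
print(dist(phi(t + math.pi), phi(t)))         # far from 0: pi is not a period
```

The origin is a zero of this field, so the integral curve starting there is constant, which is the other possibility in the claim above.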

Now we note that the existence and uniqueness theorem guarantees solution curves
through all points in $M$. Thus we can define a flow at every point in $M$. Specifically,
$\Phi_t(x) = \phi(t + t_0)$ where $\phi$ is a solution curve that passes through $x$, and
$t_0$ is a real number for which $\phi(t_0) = x$. The collection of the $\Phi_t$ will be called a
flow on $M$ determined by $X$. Since $\Phi_t \Phi_s = \Phi_{t+s}$ holds at each point, it holds
in general (whenever the composition makes sense). We can prove more.

Suppose $\Phi_t(x) = \Phi_t(y) = z$. This means that the integral curve passing through
$x$ and the integral curve passing through $y$ meet at $z$. Say $\phi_1(t_1) = x$, $\phi_2(t_2) = y$
and $\phi_1(t_3) = \phi_2(t_4) = z$. Now $\phi_3(t) = \phi_2(t + (t_4 - t_3))$ solves the same initial
value problem as $\phi_1$ (repeat the analysis several paragraphs above), so $\phi_3 = \phi_1$
and $\phi_1(t) = \phi_2(t + (t_4 - t_3))$. So $x = \phi_1(t_1) = \phi_2(t_1 + (t_4 - t_3))$. Now $z =
\Phi_t(x) = \phi_2(t + t_1 + (t_4 - t_3))$ and $z = \Phi_t(y) = \phi_2(t + t_2)$. Thus $\phi_2$ is periodic and
$\phi_2(s) = \phi_2(s + (t_1 - t_2) + (t_4 - t_3))$ for all $s$. But then $y = \phi_2(t_2) = \phi_2(t_1 + (t_4 - t_3)) = x$.
We have shown that each $\Phi_t$ is one to one.

Showing that $\Phi_t$ is onto requires an assumption. We now assume that the domain
of each integral curve is all of $\mathbf{R}$. Let $x$ be in the domain of the system.
Then $\Phi_{-t}$ is defined as well as $\Phi_t$. We have $\Phi_t \circ \Phi_{-t} = \Phi_0$ which is the identity.
Thus $x = \Phi_t(\Phi_{-t}(x))$ and $\Phi_t$ is onto. Note that consideration of $\Phi_{-t}$ also shows
that $\Phi_t$ is one to one, but the paragraph above shows that $\Phi_t$ is one to one without
the assumption that integral curves are defined on all of $\mathbf{R}$.
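The role of the assumption can be seen on the earlier example $x' = 1 + x^2$, whose integral curve through $x$ is $t \mapsto \tan(\arctan x + t)$. The sketch below writes the flow in closed form and returns `None` where $\Phi_t(x)$ is undefined. The function name `flow` and the use of `None` for "undefined" are illustrative choices.

```python
import math

def flow(t, x):
    # Phi_t(x) for x' = 1 + x^2; None when the integral curve through x
    # is not defined t units of time away from x.
    s = math.atan(x) + t
    if abs(s) >= math.pi / 2:
        return None
    return math.tan(s)

print(flow(0.3, flow(0.2, 1.0)), flow(0.5, 1.0))  # group law where defined
print(flow(-0.2, flow(0.2, 1.0)))                 # Phi_{-t} undoes Phi_t
print(flow(1.0, 1.0))                             # None: the curve has blown up
print(flow(math.pi, 0.0))                         # None: Phi_pi has empty domain
```

Where both sides are defined the relation $\Phi_t \circ \Phi_s = \Phi_{t+s}$ holds and each $\Phi_t$ is one to one, but $\Phi_t$ is not onto $\mathbf{R}$ for $t \ne 0$, and for $|t| \ge \pi$ its domain is empty.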

From now on, we assume that integral curves are defined on all of $\mathbf{R}$. This gives
us one to one correspondences $\Phi_t$. Because of the fact that $\Phi_0$ is the identity
one to one correspondence and $\Phi_t \circ \Phi_s = \Phi_{t+s}$, we have a group of one to one
correspondences and the function $t \mapsto \Phi_t$ is a homomorphism. This situation is
almost never referred to as a one parameter family of one to one correspondences.
There is such a thing as a one parameter family of homeomorphisms, but we don’t
know yet that the functions $\Phi_t$ are homeomorphisms. It remains to discuss what
kind of one to one correspondences the $\Phi_t$ are.

The following can be proven, but will not be proven here. To simplify the statement,
we use $\Phi$ to represent the flow $\Phi_t$ on $M$ and regard the domain of $\Phi$ to be
$\mathbf{R} \times M$. Here $\Phi(t,x) = \Phi_t(x)$.


THEOREM 9.3. Let $M$ be a $C^{r+1}$ manifold with $r \ge 1$. Let $X$ be a $C^r$ vector
field on $M$. Then the flow $\Phi$ on $M$ determined by $X$ is $C^r$ on its domain. In
particular, each $\Phi_t$ is a $C^r$ homeomorphism from $M$ to itself.


Of course the above statement is limited by the fact that the integral curves for
$X$ may have limited domains of definition. The following gives a condition that
avoids this problem. We will not prove it here.


THEOREM 9.4. Let $M$ in Theorem 9.3 be compact. Then the domain of the
flow $\Phi$ determined by the vector field $X$ is all of $\mathbf{R} \times M$ and each $\Phi_t$ is a $C^r$
diffeomorphism.


10. Consequences of the Inverse Function Theorem. 


In this section we present more theorems that obtain information from the deriva- 
tive of a function. They are all based on the Inverse Function Theorem. 

To make the statements simpler we invent some notation. Let $f : M \to N$ be
a $C^r$ map, $r \ge 1$, from an $m$-manifold to an $n$-manifold and let $x \in M$. If
$(U,\theta)$ and $(V,\phi)$ are coordinate charts of $M$ and $N$ respectively with $x \in U$ and
$f(x) \in V$ so that $\theta(x) = 0$ and $\phi(f(x)) = 0$, then we say that $h = \phi \circ f \circ \theta^{-1}$ is
an expression of $f$ in local coordinates centered about $x$.


THEOREM 10.1 (IMMERSION THEOREM). Let $f : M \to N$ be a $C^r$ map, $r \ge 1$,
from an $m$-manifold to an $n$-manifold. Let $Df_x$ be a monomorphism for some
$x \in M$. Then there is an expression $h : \mathbf{R}^m \to \mathbf{R}^n$ of $f$ in local coordinates
centered about $x$ for which $h(x_1, \ldots, x_m) = (x_1, \ldots, x_m, 0, \ldots, 0)$.


PROOF: As in the beginning of the proof of the Inverse Function Theorem, a local
change of coordinates allows us to assume that $f$ is a function from an open set $U_1$
in $\mathbf{R}^m$ into $\mathbf{R}^n$ that takes 0 to 0 and which has $Df_0 : \mathbf{R}^m \to \mathbf{R}^n$ act by taking
$(x_1, \ldots, x_m)$ to $(x_1, \ldots, x_m, 0, \ldots, 0)$.

Let $j : \mathbf{R}^{n-m} \to \mathbf{R}^n$ act by taking $(x_1, \ldots, x_{n-m})$ to $(0, \ldots, 0, x_1, \ldots, x_{n-m})$.
We define $\bar f : U_1 \times \mathbf{R}^{n-m} \to \mathbf{R}^n$ by $\bar f(u,v) = f(u) + j(v)$. The domains of $f$, $\bar f$
and $j$ do not agree, but we can fix this up by introducing $\pi_1$ and $\pi_2$ which project
$U_1 \times \mathbf{R}^{n-m}$ onto its first and second factors respectively. Now we have

$$\bar f(u,v) = (f \circ \pi_1)(u,v) + (j \circ \pi_2)(u,v).$$

Each of $j$, $\pi_1$ and $\pi_2$ is linear and its own derivative. We have

$$\begin{aligned}
D\bar f_{(0,0)}(a,b) &= D(f \circ \pi_1)_{(0,0)}(a,b) + D(j \circ \pi_2)_{(0,0)}(a,b) \\
&= Df_0(a) + j(b) \\
&= (a,b)
\end{aligned}$$

by our assumptions about $Df_0$.

By the Inverse Function Theorem, there is an open set $U_2$ in $U_1 \times \mathbf{R}^{n-m}$
containing $(0,0)$ on which $\bar f$ is a $C^r$ diffeomorphism onto an open set in $\mathbf{R}^n$. By
the discussion in Section 5, there is a coordinate chart with domain $U_3$ in $U_2$ taking $U_3$ to
$\mathbf{R}^n$ in a way that takes $U_3 \cap U_1$ to $\mathbf{R}^m \times \{(0, \ldots, 0)\}$. (The functions discussed
in Section 5 “respect” the coordinates.) Now the last few lines in the proof of the
corollary to the Inverse Function Theorem can be duplicated.


THEOREM 10.2 (SUBMERSION THEOREM). Let $f : M \to N$ be a $C^r$ map, $r \ge 1$,
from an $m$-manifold to an $n$-manifold. Let $Df_x$ be an epimorphism for some
$x \in M$. Then there is an expression $h : \mathbf{R}^m \to \mathbf{R}^n$ of $f$ in local coordinates
centered about $x$ for which $h(x_1, \ldots, x_n, x_{n+1}, \ldots, x_m) = (x_1, \ldots, x_n)$.


PROOF: Again, a local change of coordinates allows us to assume that $f$ is a
function from an open set $U_1$ in $\mathbf{R}^m$ into $\mathbf{R}^n$ that takes 0 to 0 and which has
$Df_0 : \mathbf{R}^m \to \mathbf{R}^n$ act by taking $(x_1, \ldots, x_n, x_{n+1}, \ldots, x_m)$ to $(x_1, \ldots, x_n)$.

Let $\pi : \mathbf{R}^m \to \mathbf{R}^{m-n}$ take $(x_1, \ldots, x_n, x_{n+1}, \ldots, x_m)$ to $(x_{n+1}, \ldots, x_m)$. Define
$\bar f : U_1 \to \mathbf{R}^n \times \mathbf{R}^{m-n}$ by setting $\bar f(u) = (f(u), \pi(u))$. Since $\pi$ is linear, we have

$$D\bar f_0(a) = (Df_0(a), \pi(a)) = a$$

by our assumption on $Df_0$. The rest of the argument proceeds as in the proof of
the Immersion Theorem.


A function is called an immersion (submersion) at an x in its domain, if the 
Immersion (Submersion) Theorem applies to the function at x. A function is 
called an immersion (submersion) if it is an immersion (submersion) at each point 
in its domain. 

This leads to more terminology. A point in the domain of a function is a regular 
point of the function if the function is a submersion there. A point in the domain 
of a function is a critical point of the function if it is not a regular point of the 
function. A point in the range of a function is a critical value of the function if it 
is the image of a critical point of the function. A point in the range of a function
is a regular value of the function if it is not a critical value of the function. This 
chain of positive and negative definitions leads to conclusions that are worth getting 
used to. A point that is in the range but not the image of a function must be a 
regular value of the function since it cannot be a critical value. If f : M—- N is 
a function from an m-manifold to an n-manifold with m <n, then all points in 
M are critical points and all points in the image of f are critical values since it is 
impossible for f to be a submersion anywhere. If a function is a submersion, then 
all points in the domain are regular points and all points in the range (whether in 
the image or not) are regular values. Lastly, the image of a regular point might 
still be a critical value if it is also the image of a critical point. That is, a regular 
value has the property that no point in its preimage is a critical point. 
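The last remark can be made concrete with the map $f(x,y) = x^2 - y^2$ from $\mathbf{R}^2$ to $\mathbf{R}$, an example chosen here for illustration and not taken from the notes. Its only critical point is the origin, so 0 is its only critical value, yet the regular point $(1,1)$ also maps to 0. The helper names below are hypothetical.

```python
def f(x, y):
    # Hypothetical example map from R^2 to R.
    return x * x - y * y

def gradient(x, y):
    # Df_{(x,y)} written as a 1x2 matrix.
    return (2 * x, -2 * y)

def is_regular_point(x, y):
    # Df_{(x,y)} maps onto R exactly when the gradient is non-zero.
    return gradient(x, y) != (0, 0)

print(is_regular_point(0, 0))   # False: the origin is a critical point
print(is_regular_point(1, 1))   # True: (1, 1) is a regular point
print(f(1, 1) == f(0, 0))       # True: both map to the critical value 0
```

Every value other than 0 is a regular value of this map, since its preimage avoids the origin.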

The “subimmersion theorem” fails. The function $x \mapsto x^2$ from $\mathbf{R}$ to $\mathbf{R}$ has
derivative at 0 that is neither one to one nor onto. There is also no expression of
the function in local coordinates centered at 0 that is linear. It is interesting to
see how far a combined proof of the Immersion and Submersion Theorems can be
pushed before it fails.

If $k$ is a constant and $x$ is a vector of several components, then under some conditions
a formula such as $f(x) = k$ can define some of the coordinates as functions of
some of the others. The Implicit Function Theorem says when and to what extent.
The standard example of $x^2 + y^2 = 1$ shows that the hypotheses and conclusions
are reasonable.
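The circle example can be sketched numerically. Near $(0,1)$ the equation $x^2 + y^2 = 1$ defines $y$ as the function $g(x) = \sqrt{1 - x^2}$, whose derivative is $-x/g(x)$, while at $(1,0)$ the partial derivative in $y$ vanishes and no such function of $x$ exists on an open interval about 1. The helper names and the finite-difference derivative below are assumptions made for illustration.

```python
import math

def f(x, y):
    return x * x + y * y

def g(x):
    # Branch of the circle through (0, 1), valid for |x| < 1.
    return math.sqrt(1.0 - x * x)

def dg(x, h=1e-6):
    # Central-difference approximation of g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

def partial_y(x, y):
    # Partial derivative of f with respect to its second variable.
    return 2.0 * y

x = 0.3
print(f(x, g(x)))                  # 1.0 up to rounding
print(dg(x), -x / g(x))            # g'(x) agrees with -(df/dx)/(df/dy)
print(partial_y(1.0, 0.0))         # 0.0: the hypothesis fails at (1, 0)
```

The agreement of `dg(x)` with `-x / g(x)` is what differentiating $f(x, g(x)) = 1$ with the chain rule predicts.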

To help with the statement of the theorem, we need a reasonable way to refer to
a partial derivative with respect to one variable. Let $f : U \times V \to W$ be given and
let $j_u : V \to U \times V$ be defined by $j_u(v) = (u,v)$. As in the remarks at the end of
Section 7, $j_u$ is not linear but a constant plus a linear map. Its derivative is the linear
part and we have $D(j_u)_v = j_0$ for any $v$. (We have to keep careful track of the
meaning of the subscripts.) We define $D_2 f_{(u,v)}$ to be $D(f \circ j_u)_v = (Df_{(u,v)} \circ j_0)$.


THEOREM 10.3 (IMPLICIT FUNCTION THEOREM). Let $f : U \times V \to N$ be a $C^r$
function, $r \ge 1$, between manifolds. Assume that $D_2 f_{(u,v)}$ is an isomorphism for
some $(u,v)$ and let $k = f(u,v)$. Then there is an open set $U_1$ about $u$ in $U$,
an open set $V_1$ about $v$ in $V$ and a $C^r$ function $g : U_1 \to V_1$ so that for every
$(x,y) \in U_1 \times V_1$ we have $f(x,y) = k$ if and only if $y = g(x)$. Further, if $U_2 \subset U_1$
is open and connected about $u$, then any continuous $g_0 : U_2 \to V$ with $g_0(u) = v$
and satisfying $f(x, g_0(x)) = k$ for every $x \in U_2$ must agree with $g$ on $U_2$.


REMARK: The function $g$ is the function that is being “implicitly” defined by the
equation $f(u,v) = k$.


PROOF: By local change of coordinates, we can assume that $U$ and $V$ are open
subsets of $\mathbf{R}^m$ and $\mathbf{R}^n$ respectively, that $(u,v) = (0,0)$, that $N$ is $\mathbf{R}^n$ (the
dimension is fixed by the isomorphism $D_2 f_{(0,0)}$), that $f(0,0) = 0$, and that

$$D_2 f_{(0,0)}(b) = D(f \circ j_0)_0(b) = (Df_{(0,0)} \circ j_0)(b) = Df_{(0,0)}(0,b) = b.$$

We now use $u$ and $v$ as arbitrary elements of $U$ and $V$ and not as reference to
items in the statement.

Let $\bar f : U \times V \to \mathbf{R}^m \times \mathbf{R}^n$ be defined by

$$\bar f(u,v) = (u, f(u,v)) = (\pi(u,v), f(u,v))$$

where $\pi : U \times V \to U$ is projection. Now

$$D\bar f_{(0,0)}(a,b) = (\pi(a,b), Df_{(0,0)}(a,b)) = (a, Df_{(0,0)}(a,0) + b).$$

This linear map is an isomorphism: if it takes $(a,b)$ to $(0,0)$, then $a = 0$ and hence $b = 0$.
So $\bar f$ is a $C^r$ diffeomorphism from some open set about $(0,0)$ to an open set about
$(0,0)$. Thus on some open set of the form $U_1 \times V_1$, we have a $C^r$ inverse $h$ of $\bar f$ from
an open set $W$ about $(0,0) \in \mathbf{R}^m \times \mathbf{R}^n$ onto $U_1 \times V_1$. Every $(x,y) \in W$ has

$$h(x,y) = (h_1(x,y), h_2(x,y))$$

where, by Lemma 2.5, both $h_1$ and $h_2$ are $C^r$. Now

$$\begin{aligned}
(x,y) &= \bar f(h(x,y)) \\
&= \bar f(h_1(x,y), h_2(x,y)) \\
&= (h_1(x,y), f(h_1(x,y), h_2(x,y))),
\end{aligned}$$

so $h_1(x,y) = x$ for all $(x,y)$ in $W$. So $h(x,y) = (x, h_2(x,y))$ and

$$\begin{aligned}
(x,y) &= \bar f(h(x,y)) \\
&= \bar f(x, h_2(x,y)) \\
&= (x, f(x, h_2(x,y))).
\end{aligned}$$

This gives that $f(x, h_2(x,y)) = 0$ if and only if $y = 0$. Let $g(x) = h_2(x,0)$. Now
$f(x,z) = 0$ if and only if $z = h_2(x,0) = g(x)$. This holds for all $(x,z) \in U_1 \times V_1$
since every such $(x,z)$ is of the form $(x, h_2(x,y))$ for an $(x,y) \in W$.

Now assume $U_2$ is a connected, open subset of $U_1$ about 0 and assume there is
a continuous function $g_0 : U_2 \to V$ which has $g_0(0) = 0$ and $f(x, g_0(x)) = 0$
for every $x \in U_2$. Consider the subset $A$ of $U_2$ on which $g_0 = g$. We know $0 \in A$.
Let $x_0$ be in $A$. By the continuity of $g_0$, there is an open $U_3 \subset U_2$ about $x_0$ so
that $g_0(U_3) \subset V_1$. But for $x \in U_3$, we have $(x, g_0(x)) \in U_3 \times V_1 \subset U_1 \times V_1$ and
here $f(x, g_0(x)) = 0$ if and only if $g_0(x) = g(x)$. Thus $A$ is open in $U_2$. Now $A$ is
the inverse image of 0 under the continuous $g - g_0$. Thus $A$ is also closed in $U_2$.
Since $U_2$ is connected, $A$ is all of $U_2$.


11. Submanifolds. 


Let A be a subset of a C” m-manifold M. We say that A is aC" submanifold 
of M of dimension k if each point a of A lies in the domain of a chart (U,9) of 
M so that if R* C R™ is the set of points in R”™ whose last m —k coordinates 
are 0, then 

UnA=6\(R‘), 


The chart (U,@) is called a submanifold chart for A in M. Note that all the 
charts (UNA, @|una) where (U,@) is a submanifold chart for A in M define a C” 
differentiable structure for A. 

The inclusion of the submanifold A into M is an immersion. That is because a 
non-zero tangent vector in A cannot become zero in M since a coordinate function
to test the tangent vector in A is the restriction of a coordinate function that tests
it in M. The inclusion is also more than that. A basic open set in A (say the
domain of a coordinate chart) is also open in A in the subspace topology that A
gets from M. Thus the inclusion map is open and is a homeomorphism onto A.
That this obvious fact is worth pointing out is seen from the next two examples.
We give the more complicated one first.

Let S^1 × S^1 be covered by R^2 in the usual way so that two points in R^2 project
to the same point in S^1 × S^1 if and only if their coordinates differ by integers. Let
L be a straight line in R^2 of irrational slope. It is impossible for two points on
L to have coordinates that differ by integers, so the covering projection restricted
to L is one to one. It is also an immersion. (Covering projections are immersions
under the reasonable assumption that the charts of the base space and the charts
of the covering space are chosen compatibly.) However it is not a homeomorphism
onto its image in S^1 × S^1 and its image is not a submanifold of S^1 × S^1. To argue
that these statements are true, we argue that the image is dense in S^1 × S^1. First
we need a lemma.


LEMMA 11.1. Let r be a positive irrational number, let x and ε > 0 be real, and
let k be a positive integer. Then there are integers m and n with |m| > k so that
mr - n is within ε of x.


PROOF: Consider the half open interval [0,1) as representative of the real numbers
modulo 1. Then the function from kZ to [0,1) taking km to kmr mod 1 is one
to one since km_1 r - km_2 r ∈ Z with m_1 ≠ m_2 implies that r is rational. Thus there
are infinitely many different numbers in [0,1) of the form kmr - n for integers m and n.
There must be two, (km_1 r - n_1) < (km_2 r - n_2), in [0,1) that differ by less than
ε. Let δ = k(m_2 - m_1)r - (n_2 - n_1). Now 0 < δ and δ is smaller than both 1 and
ε. If m_2 = m_1, then δ is an integer and cannot be greater than 0 and less than
1, so m_2 ≠ m_1. Now the integral multiples of δ divide the real line into intervals of
length δ so x is within δ (which is less than ε) of at least two consecutive integral
multiples of δ. We can thus choose one integral multiple of δ that is not 0 and is
within ε of x. We now have that x is within ε of a number of the form kpr - q where p
and q are integers and p is not 0. Taking m = kp and n = q gives |m| ≥ k, and
running the same argument with k + 1 in place of k gives |m| > k. This completes
the lemma.
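The lemma can be checked numerically. The following is a minimal sketch, not the argument of the proof: it simply searches over m and takes n to be the nearest integer to mr - x. The function name approximate and the cutoff max_m are assumptions made for illustration only.

```python
import math

def approximate(r, x, eps, k, max_m=10**6):
    """Search for integers m, n with |m| > k and |m*r - n - x| < eps.

    Brute-force illustration only. The lemma guarantees that such a pair
    exists for irrational r; max_m is an arbitrary cutoff for the search.
    """
    for size in range(k + 1, max_m):
        for m in (size, -size):
            n = round(m * r - x)  # nearest integer to m*r - x
            if abs(m * r - n - x) < eps:
                return m, n
    raise ValueError("no pair found below max_m")

# Example with r = sqrt(2), x = 0.3, eps = 0.001 and k = 10.
m, n = approximate(math.sqrt(2), x=0.3, eps=1e-3, k=10)
print(m, n, m * math.sqrt(2) - n)
```

Floating point arithmetic stands in for the real numbers here, so the sketch is only meaningful for moderate values of m and ε.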


Now back to the line L in R^2 of irrational slope r. Let its equation be y = rx + c.
The distance from a point (a, b) in R^2 to L is no more than |b - (ra + c)| since this is
the vertical distance from L to (a, b). If m and n are integers, then (a + m, b + n)
projects to the same point in S^1 × S^1 as (a, b) does. The distance from such a
point to L is no more than |b + n - (ra + rm + c)| = |(b - ra - c) - (rm - n)|. From
the lemma above, we know that we can make (rm - n) as close to (b - ra - c)
as we like and we can do it with arbitrarily large values of |m|. It is now easy
to create a sequence of points in L that is discrete in L but whose images under
projection to S^1 × S^1 converge to the image of (a, b). This allows us to make two
conclusions. The first is that the image of L is dense in S^1 × S^1. The second is
that the projection restricted to L does not carry L homeomorphically onto its
image. For let x be a point of L and let x_i be a discrete sequence of points in L
whose images converge in S^1 × S^1 to the image of x. The inverse map from the
image of L to L cannot be continuous since it will not preserve the limit of the
convergent sequence. The problem with the projection restricted to L is that while
it is a one to one continuous map, it is not open.

To argue that the image of L is not a submanifold of S^1 × S^1 we note that any
open set around a point in the image has its intersection with the image dense in
the open set. But the definition of submanifold would demand a coordinate chart
(U, θ) in which the intersection of the image of L with U would definitely not be
dense in U.

We have constructed an example of an injective immersion that is not a
homeomorphism onto its image and whose image is not a submanifold. A much easier
example is an injective immersion of the open unit interval into the open unit disk
in R^2 so that its image is homeomorphic to the numeral “6.” These examples lead
to a definition and a lemma. We say that an immersion that is a homeomorphism
onto its image is an embedding.


LEMMA 11.2. Let N be a C^r manifold, r ≥ 1. A subset A of N is a C^r
submanifold if and only if A is the image of a C^r embedding.


PROOF: The forward direction has been argued above. We consider the reverse
direction. Let A be the image of the C^r embedding f : M → N. A point x in A
has an open neighborhood U which is the image of an open V in M. The set U
is of the form U' ∩ A where U' is open in N. From the Immersion Theorem, there
is an expression of f in local coordinates based on charts contained in U' and V
that gives exactly the structure needed for a submanifold chart around x.


In the above, we exploited the fact that the expression in local coordinates
guaranteed by the Immersion Theorem gives a structure that fits the definition of a
submanifold chart. We can also look at the expression in local coordinates that is
guaranteed by the Submersion Theorem. Here we are looking at the projection of
R^m onto the subspace spanned by a subset of its coordinate axes. The preimage of
0 under this projection (the kernel) lies in R^m exactly as required by the definition
of a submanifold chart. That makes the next lemma an easy exercise.


LEMMA 11.3. Let f : M → N be a C^r map, r ≥ 1. If y ∈ f(M) is a regular
value, then f^{-1}(y) is a C^r submanifold of M.


There is no “only if” in the above. There are submanifolds that are not the
inverse images of regular values under any map. The center line L of the Möbius
band M does not separate any neighborhood of itself in M. (We have not dealt
with manifolds with boundary, so we consider M to be the open Möbius band.)
For L to be the inverse image of a regular value, there has to be a submersion to
a manifold of dimension 1. But every point in a manifold of dimension 1 separates
some neighborhood of itself. [Exercise: the center line L of the Möbius band M is
the inverse image of a critical value of a function f : M → R.]

It should be noted that there is nothing in the definition of a submanifold that 
requires it be a closed subset of the manifold that contains it. Some like to include 
a requirement that submanifolds be closed subsets. Exercise: find an example of a 
submanifold of R^2 that is not a closed subset.

We end this section with some notation. We have been using T_x to denote the
tangent space to a manifold at x. Until now this has offered no opportunity for
ambiguity since the manifold in question was always the unique manifold containing
x. Now that one manifold can be a submanifold of another, the notation is not
specific enough. We will continue to use it when there is no problem. There are
two notations that are standard to resolve the ambiguity. One is to use M_x to
denote the tangent space to M at x and the other is to use T_x M to denote the
same thing. We will use the first when needed because it is one less character to
type.

It is important to note that if M is a C^r submanifold of N and x ∈ M, then M_x
is a vector subspace of N_x and that if i : M → N is the inclusion map, then Di_x
is the linear inclusion of M_x into N_x. This is straightforward from the definitions
of “submanifold”, M_x, N_x, and Di_x.


12. Bump functions and partitions of unity. 


This section introduces two very powerful tools available when working with 
differentiable functions. One typical way that they are used is to deduce global 
information from local information. Before we give sample applications, we have 
to develop the techniques. 

Consider the function

        f(t) = e^{-1/t}   for t > 0,
        f(t) = 0          for t ≤ 0.


Before we look at properties of f, we show

(17)    lim_{t→0+} e^{-1/t} / t^n = 0

for each integer n ≥ 0. Replacing t^{-1} by x lets us rewrite the limit in (17) as

        lim_{x→∞} x^n e^{-x} = lim_{x→∞} x^n / e^x

which is shown to be 0 by L’Hôpital’s rule. The first consequence of (17) is that f
is continuous.
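The limit (17) can be observed numerically. The sketch below is only an illustration: the helper name ratio and the sample values of t and n are choices made here, not part of the notes.

```python
import math

def f(t):
    """The function of this section: exp(-1/t) for t > 0 and 0 for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def ratio(t, n):
    """The quantity f(t) / t**n whose limit as t -> 0+ appears in (17)."""
    return f(t) / t**n

# Even with a fairly large power of t in the denominator, the
# exponential wins and the ratio shrinks rapidly as t decreases.
for t in (0.1, 0.05, 0.01):
    print(t, ratio(t, 5))
```

A table of values like this proves nothing, but it shows how quickly the exponential factor dominates any fixed negative power of t.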

We note that f'(t) = 0 for negative t. We now discuss f'(t) for positive t and
assume that t > 0 for the rest of the paragraph. The function f has the form e^g
where g is the function g(t) = -t^{-1}. It is the case that higher derivatives f^{(n)}(t)
have the form (e^g)(P(g)) where P(g) is a polynomial combination of derivatives
of g. This is easily shown by induction and the chain rule. It is also proven by
induction that derivatives of g are polynomial combinations of negative powers of
t. Thus f^{(n)}(t) is of the form (e^g)(Q(t)) where Q(t) is a polynomial in negative
powers of t. By (17) we now have

        lim_{t→0+} f^{(n)}(t) = 0.

Thus if we show that f^{(n)}(0) = 0 for all n, then f is C^∞. But to show that
f^{(n)}(0) = 0 inductively from the definition of the first derivative, we are reduced
to showing that

        lim_{t→0+} f^{(n-1)}(t) / t = 0

which follows from (17).

Note that while f is C^∞, it is not analytic at 0. No power series can give the
constant function 0 to the left of 0 and simultaneously the non-constant function
e^{-1/t} to the right of 0. There is a notion of an analytic manifold based on
coordinate charts with analytic overlap maps. They are harder to work with since the
techniques of this section are not available with these spaces.

We can build various interesting functions from f. 


Let

        g_1(t) = f(t) / (f(t) + f(1 - t)).

The denominator is never 0 since t and 1 - t are never simultaneously non-positive.
Thus g_1 is C^∞. Now g_1(t) = 0 for t ≤ 0, 0 < g_1(t) < 1 for 0 < t < 1, and g_1(t) = 1
for t ≥ 1. Setting g_2(t) = g_1(t - 1) and g_3(t) = g_2(-t) gives C^∞ functions where
g_2 is 0 on (-∞, 1] and 1 on [2, ∞) and g_3 is 1 on (-∞, -2] and 0 on [-1, ∞).
Thus if h(t) = 1 - (g_2(t) + g_3(t)), then 0 ≤ h(t) ≤ 1 for all t, and h(t) is 1 when
|t| ≤ 1 and 0 when |t| ≥ 2. The function h is typically called a bump function.
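The construction of h can be followed step by step in code. This is a direct transcription under the assumption that f(t) = e^{-1/t} for t > 0 and f(t) = 0 otherwise, with floating point numbers standing in for the reals.

```python
import math

def f(t):
    """exp(-1/t) for t > 0 and 0 for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def g1(t):
    """0 for t <= 0, 1 for t >= 1, strictly between for 0 < t < 1."""
    # The denominator is positive: t and 1 - t are never both <= 0.
    return f(t) / (f(t) + f(1.0 - t))

def g2(t):
    """g1 shifted right by 1: 0 on (-inf, 1] and 1 on [2, inf)."""
    return g1(t - 1.0)

def g3(t):
    """Mirror image of g2: 1 on (-inf, -2] and 0 on [-1, inf)."""
    return g2(-t)

def h(t):
    """Bump function: 1 when |t| <= 1 and 0 when |t| >= 2."""
    return 1.0 - (g2(t) + g3(t))

for t in (-2.5, -1.5, 0.0, 1.0, 1.5, 2.0):
    print(t, h(t))
```

By the symmetry of g_1 about t = 1/2, the value of h halfway up each slope (at t = 1.5 and t = -1.5) is 1/2.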

Higher dimensional versions can be constructed. Consider the function φ : R^m →
R defined by

        φ(x_1, ..., x_m) = h(x_1)h(x_2) ··· h(x_m).

The function φ is C^∞, has its values in [0,1], takes on the value 1 on [-1, 1]^m
and takes on the value 0 off (-2, 2)^m. Clearly φ can be adjusted so that given
an ε > 0, the boxes [-ε, ε]^m and (-2ε, 2ε)^m replace [-1, 1]^m and (-2, 2)^m. Also,
these boxes can be centered at points other than the origin. This is worth noting
as a lemma. We introduce some notation to make this lemma and later lemmas
easier to state.

Let C ⊂ U be a closed set in an open set in a C^r manifold M. We say that a
C^r function φ : M → R is a bump function for the pair (U, C) if φ(M) ⊂ [0,1],
φ(C) = {1}, and φ(M - U) = {0}. So far we have shown:


LEMMA 12.1. Let ε > 0 be real. Let x = (x_1, ..., x_m) ∈ R^m. Let

        C = {(y_1, ..., y_m) ∈ R^m | x_i - ε ≤ y_i ≤ x_i + ε, 1 ≤ i ≤ m}

and let

        U = {(y_1, ..., y_m) ∈ R^m | x_i - 2ε < y_i < x_i + 2ε, 1 ≤ i ≤ m}.


Then there is a C^∞ bump function for (U, C).
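A bump function for the pair of boxes in Lemma 12.1 can be written down as a product of rescaled copies of the one-variable bump function. The sketch below is one way to do it; the name box_bump and the sample center and ε are illustrative assumptions.

```python
import math

def f(t):
    return math.exp(-1.0 / t) if t > 0 else 0.0

def h(t):
    """One-variable bump: 1 on [-1, 1], 0 outside (-2, 2)."""
    def g1(s):
        return f(s) / (f(s) + f(1.0 - s))
    # g2(t) = g1(t - 1) and g3(t) = g2(-t) = g1(-t - 1).
    return 1.0 - (g1(t - 1.0) + g1(-t - 1.0))

def box_bump(y, center, eps):
    """Product of rescaled, recentered copies of h, one per coordinate.

    Equals 1 when every |y_i - x_i| <= eps and 0 as soon as some
    |y_i - x_i| >= 2 * eps, where x = center.
    """
    return math.prod(h((yi - ci) / eps) for yi, ci in zip(y, center))

center = (1.0, 2.0)
print(box_bump((1.5, 1.5), center, 0.5))   # on the boundary of C
print(box_bump((1.75, 2.0), center, 0.5))  # between C and the edge of U
print(box_bump((2.0, 2.0), center, 0.5))   # on the boundary of U
```

Each factor depends on one coordinate only, so the product is as smooth as h and vanishes as soon as one coordinate leaves the open interval of half-width 2ε.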


Now let K ⊂ U be a compact set in an open set in a C^r m-manifold M. Let x be
a point of K lying in the domain of a coordinate function θ. Then in the domain of θ
we can arrange x ∈ C_x ⊂ U_x ⊂ U, where U_x lies in the domain of θ, where θ(C_x) is
a box of diameter δ_x centered at θ(x), and where θ(U_x) is a box of diameter 2δ_x
centered at θ(x). Note that this forces x to be in the interior of C_x. By composing θ
with a C^∞ bump function for the pair (θ(U_x), θ(C_x)) we get a C^r bump function for
(U_x, C_x) that is defined on the domain of the coordinate function. We extend the bump
function to a function φ_x defined on all of M by letting φ_x be 0 off the domain of
the coordinate function. This extends all the relevant derivatives continuously since
they all vanish off U_x. The interiors of the C_x form an open cover of K from which
a finite subcover can be extracted. Let the corresponding “centers” be {x_1, ..., x_s}
and let the corresponding (U, C) pairs be denoted (U_i, C_i), 1 ≤ i ≤ s. For each i,
let φ_i be the bump function above for (U_i, C_i). Now if we define Φ : M → R by


        Φ(x) = Σ_{i=1}^{s} φ_i(x),


then Φ is non-negative and C^r and Φ(x) has strictly positive values on K and is
0 off U. This is not exactly a bump function because we have no control on the
exact values of Φ on K. We can improve on this if desired. We will need what we
have just proven in order to get to the improvements so we state it as a lemma.


LEMMA 12.2. Let K ⊂ U ⊂ M where K is compact and U is open and M is a
C^r manifold. Then there is a C^r function from M to R taking values in [0, ∞),
taking the value 0 off U and strictly positive values on K.


In order to get more, we need the notions of paracompact and partition of unity. 
A topological space is paracompact if every open cover of the space has a locally 
finite open refinement. A refinement of a cover is another cover so that every 
element of the refinement is contained in some element of the original. A cover 
is locally finite if every point of the space has a neighborhood that intersects only 
finitely many elements of the cover. The following are proven in Section 6-4 of 
Munkres: 


THEOREM 12.3 (STONE’S THEOREM). Every metric space is paracompact.


THEOREM 12.4. Every paracompact space is normal.


The first result applies here because we are only looking at metric spaces. The 
second result applies as well, but a direct proof that metric spaces are normal is 
much easier than going through Stone’s theorem. 

Let f : X → [0, ∞) be a map. The support of f is the closure of the pre-image
of (0, ∞). If O is an open cover and {φ_α} is a collection of functions from X to
[0, ∞), then the collection of functions is a partition of unity subordinate to the
cover O if the collection of supports of the φ_α is a refinement of O, if for all x


        Σ_α φ_α(x) = 1


and if the sum involves only finitely many non-zero terms for each x. Since the
values of the functions are never negative, they can never exceed 1. Note that
even if O is locally finite, there might be infinitely many non-zero terms in the sum
without the extra assumption that this does not happen. The following modification
of the definition of partition of unity is used to make the finiteness automatic if
O is locally finite. If O = {U_α}_{α∈J} is the open cover, then the partition of unity
{φ_α}_{α∈J} is dominated by O if the support of φ_α lies in U_α for each α ∈ J.

We will not prove Stone’s Theorem. There is a perfectly good proof in Munkres. 
It takes about three pages there. We will look at some consequences. We will show: 


THEOREM 12.5. Every open cover of a C^r manifold dominates a C^r partition of
unity.


This will take several steps. We will need various technical lemmas along the 
way, as well as partial results. 


LEMMA 12.6. A locally finite open cover of a separable space has countably many 
non-empty sets. 


PROOF: The wording of the statement is to allow a given indexing set to be used
for a cover even if some (or most) of the index values refer to empty sets.

Pick a countable dense subset S. Locally finite implies the weaker point finite,
that every point of the space lies in a finite number of elements of the cover. Since
every non-empty open set contains a point in S, a list of the elements of the cover that
contain each point in S will list all the non-empty elements of the cover. But each
point in S lies in finitely many elements of the cover, so the list is countable.


LEMMA 12.7. A point finite, countable open cover {U_i} of a normal space X has
a refinement {C_i} of closed sets whose interiors cover X and with each C_i ⊂ U_i.


PROOF: Assume that {C_1, ..., C_n} have been found so that each C_i is closed and
in U_i and so that the interiors of the C_i and the U_j for j > n cover X. Let C'_{n+1}
be X minus the interiors of all the C_i, i ≤ n, and minus all the U_j, j > (n + 1).
This is a closed set. Since the only set not removed is U_{n+1} and removing U_{n+1}
would yield the empty set, we have C'_{n+1} ⊂ U_{n+1}. Now because X is normal,
there is a closed set C_{n+1} in U_{n+1} whose interior contains C'_{n+1}. We now have
our assumption with n replaced by n + 1. In this way we inductively end up with a
collection {C_i}. To argue that the interiors cover, we note that every x ∈ X lies in
finitely many U_i. After a finite number of steps, these U_i will have been replaced
by C_i. By our assumption, x must lie in one or more of the interiors of the C_i.


LEMMA 12.8. Every open cover {U_α}_{α∈J} of a paracompact X has a locally finite
open refinement {W_α}_{α∈J} where each W_α ⊂ U_α.


PROOF: Note that various W_α may be empty. Let {V_β}_{β∈K} be a locally finite
open refinement. Choose a function f : K → J so that each V_β ⊂ U_{f(β)}. Now form
{W_α}_{α∈J} by setting W_α to be the union of those V_β for which f(β) = α. This is
an open refinement since each W_α is a union of open subsets of U_α and since each
V_β is used in some W_α. Since each V_β is used in only one W_α, any neighborhood
hitting only finitely many V_β hits only finitely many W_α. Thus {W_α}_{α∈J} is locally
finite.


LEMMA 12.9. Every open cover of a C^r manifold M by sets with compact closure
dominates a C^r partition of unity.


PROOF: We can replace the given cover by a locally finite open refinement using the
same indexing set as the original. A partition of unity dominated by the new cover
will be dominated by the original. The new cover has countably many non-empty
sets. Since it is a refinement of the original the elements have compact closure.
Let the non-empty sets in the cover that we are working with be {V_i}. We can
extract a closed refinement {C_i} whose interiors cover. Since each C_i is closed
in a compact set, it is compact. By Lemma 12.2, we now have C^r non-negative
functions φ_i from M to R with each φ_i strictly positive on C_i and zero off V_i.
Thus the supports of the φ_i are locally finite and the sum Σ_i φ_i(x) is defined for
each x. Since the interiors of the C_i cover M, the sum Σ_i φ_i(x) is never 0. Now
we let

        Φ_i(x) = φ_i(x) / Σ_j φ_j(x).

The collection of the Φ_i is now a partition of unity dominated by the {V_i}. To get
a partition of unity for the original indexing set, let the function for those indexes
of empty sets be the constant function to 0.
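The normalization step in this proof is easy to see in a one-dimensional example. The sketch below assumes M = R with the locally finite cover by the intervals V_i = (i - 1, i + 1), i an integer; the functions phi and partition and the choice of cover are hypothetical illustrations, not the functions produced by Lemma 12.2.

```python
import math

def f(t):
    return math.exp(-1.0 / t) if t > 0 else 0.0

def phi(i, x):
    """Smooth, positive exactly on V_i = (i - 1, i + 1), zero elsewhere."""
    return f(x - (i - 1)) * f((i + 1) - x)

def partition(i, x):
    """Phi_i(x) = phi_i(x) / sum_j phi_j(x).

    Only the V_j with j near x contain x, so the sum over all integers j
    reduces to a finite sum over a few neighbors of floor(x).
    """
    base = math.floor(x)
    total = sum(phi(j, x) for j in range(base - 1, base + 3))
    return phi(i, x) / total

x = 2.75
print([partition(i, x) for i in range(1, 5)])
print(sum(partition(i, x) for i in range(1, 5)))
```

Local finiteness is what makes the denominator a finite sum near each point, and the fact that the sets where the phi_i are positive cover R is what keeps it from vanishing.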


The next lemma gives the promised improvement to Lemma 12.2. It also leads 
to a proof of Theorem 12.5. 


LEMMA 12.10. Let C ⊂ V ⊂ M where C is closed and V is open and M is a C^r
manifold. Then there is a C^r bump function for (V, C).


PROOF: By using coordinate charts, we can cover C by open subsets of V with
compact closure. Let U = M - C. We can also cover U by open subsets of U
which also have compact closure. These two covers together will cover M. Let {Φ_α}
be a C^r partition of unity dominated by the cover. The sum of all the elements
of the partition that satisfy the restriction that they correspond to open sets that
intersect C gives us a C^r function. It is the function we want since all the supports
are in V and since all the functions omitted by the restriction have their supports
in U and are not contributors to the fact that the sum is 1 on C.


PROOF OF THEOREM 12.5: The proof is exactly the same as the proof of Lemma 
12.9 except that Lemma 12.10 is used instead of Lemma 12.2. 


We now give two applications. The first is an example of the use of bump 
functions, and the second is an example of the use of partitions of unity. They both 
deduce global information from local information. 

The definition of a C^r manifold states that locally the manifold has C^r
embeddings into a Euclidean space. If the manifold is compact, then we can use partitions
of unity to guarantee the existence of a C^r embedding of the entire manifold into
a Euclidean space.


LEMMA 12.11. Let M be a compact C^r m-manifold, r ≥ 1. Then there is an
integer n and an embedding f : M → R^n.


PROOF: Since M is compact, there is a finite cover of M by coordinate charts
(U_i, θ_i), 1 ≤ i ≤ k. We can extract a closed cover {C_i} with each C_i ⊂ U_i and
with the interiors of the C_i covering M. For each i, let φ_i : M → R be a bump
function for the pair (U_i, C_i). Each θ_i : U_i → R^m is an embedding. Define


        g_i : M → R^m × R = R^{m+1}   by   g_i(x) = (φ_i(x)θ_i(x), φ_i(x)).


Now let

        g = (g_1, ..., g_k) : M → R^{m+1} × ··· × R^{m+1} = R^{k(m+1)}.


Now g is C^r. If x ∈ C_i, then g_i is an immersion at x since the first coordinate of
g_i on C_i is θ_i. Thus no non-zero tangent vector at x is taken to zero by Dg_i and thus
not by Dg since the Dg_i go into independent subspaces of TR^{k(m+1)}. To see that g
is an injection, consider x ≠ y. If x and y lie in one C_i, then g(x) ≠ g(y) again
since the first coordinate of g_i is θ_i which is injective on C_i. If x ∈ C_i and y ∉ C_i
then the second coordinate of g_i disagrees on x and y and g(x) ≠ g(y). So g is
an injective immersion of a compact manifold and thus an embedding.


REMARK: The result above gives nowhere close to the best estimate on the
dimension of the Euclidean space needed to receive the embedding. There is an argument
that shows that the embedding can take place in R^{2m+1}. A much more difficult
argument shows that the embedding can take place in R^{2m}.

Now for the second example. Let M and N be C^r manifolds and let C be a
closed set in M. Let f : C → N be a function. We say that f is C^r if for every
x in C, there is an open set U in M about x and a C^r function f_U : U → N so
that f|_{U∩C} = f_U|_{U∩C}.


LEMMA 12.12. A function f : C → R^n where C is a closed subset of a C^r
manifold M is C^r if and only if there is an open set U in M about C and a C^r
function f_U : U → R^n so that f = f_U|_C.


PROOF: For the “if” direction, use U for every x.

Now if f is C^r, then there is a cover {U_x}_{x∈C} of C by open sets of M and C^r
functions f_x that extend the various f|_{C∩U_x}. Let V = M - C and let a partition
of unity dominated by the open cover {U_x}_{x∈C} ∪ {V} of M consist of functions
denoted φ_x and φ_V. Now

        Σ_{x∈C} φ_x f_x

is C^r, is defined on all of M, and equals f on C.


13. The C^1 metric.


The tangent vectors to a manifold M are defined as equivalence classes of curves. 
Curves are maps from subsets of R to M. The set of curves can be formed into a 
topological space (function space) in many ways. We are familiar with some. Once 
the set of curves is formed into a function space, we can use a quotient topology 
on the set of tangent vectors. It turns out that the function space topologies that 
we are familiar with (e.g., uniform topology, uniform convergence on compact sets, 
etc.) will give bad topologies on the set of tangent vectors. In particular the 
quotient topologies are not Hausdorff. This is not hard to see, so we will go into 
some detail. 

The function space topologies that we know give some control on the values of a 
function. An open set of functions can be defined that will force any function in this 
open set to have its values on some restricted part of the domain to be near a given 
value in the range. For example, the compact open topology can be used to build 
an open set O of functions where the values on a compact subset in the domain 
are constrained to lie in a neighborhood of a given value in the range. But this 
will not control the derivative. One can build functions in O that race around the 
range neighborhood like mad giving arbitrarily large values for the derivatives at 
given points, and there will be functions in O that will stall at various points (see, 
for example, the bump functions of Section 12) giving low values of the derivative 
(even 0) at those points. 

A curve identifies a particular tangent vector in TM by seeing what the value of
the curve is at 0 (this identifies which T_x we are in) and what its derivative is at
0 (which identifies which v in T_x we are looking at). The topologies that we know
build open sets of curves in which the values of the curves at 0 are near a certain
point. For such an open set O of curves, the set of tangent vectors defined will lie in
a set of tangent spaces T_x, where the points x are confined to some neighborhood
W in M. However, the derivatives of the curves in O will take on all possible
values at 0. The set of tangent vectors defined by the curves will thus be the union
of all the T_x for x ∈ W. Taking unions and intersections of these sets of curves
will still give sets that represent entire copies of the tangent spaces T_x. Thus the
topologies that we know on the set of curves will allow us to separate points in M
by open sets but not vectors in any one T_x.

We now discuss how to control the derivative. The problem that we are working
on is the structure of TU where (U, θ) is a chart of a C^r m-manifold M. We will
use the coordinate function as a tool. This is reasonable since it is the coordinate
function that sets up the one to one correspondence between TU and θ(U) × R^m
in the first place. Also, a curve f : J → U, where J is an open interval about 0 in
R, can be composed with θ so that both its values and its derivatives are elements
of R^m.

We will use the metric on R^m to imitate the construction of the uniform metric.
The easiest way to make use of the metric is to take supremums. If we have a
compact domain, then our formulas are a little simpler since we don’t have to bound
distances by 1 all the time. Thus we restrict ourselves to the “unit disk” [-1, 1]
in R and use this for our domain for all curves. Since the relevant information
about a curve is its value and derivative at 0, this will suffice. For the rest of this
section, let I denote the interval [-1, 1] in R. When we discuss the derivative of
a function defined on I, we will use the right hand derivative at -1 and the left
hand derivative at +1.

Let d be the metric on R^m. Let C^1(I, U) be the set of C^1 functions from I to
U. Let f be an element of C^1(I, U). To simplify notation, we let \bar f denote θ∘f.
This is a curve into R^m. For f and g in C^1(I, U) define


        ρ(f, g) = max[ sup{d(\bar f(x), \bar g(x)) | x ∈ I},
                       sup{d(\bar f'(x), \bar g'(x)) | x ∈ I} ].


This can be compared with the uniform metric defined near the top of page 266 of
Munkres.

Certain calculations go through exactly as they do for the uniform metric.


LEMMA 13.1. The function ρ is a metric.


Call this the C^1 metric on C^1(I, U).


LEMMA 13.2. A sequence f_n : I → U of C^1 functions converges to the C^1 function
f in the C^1 metric if and only if the sequences f_n and f'_n converge uniformly to
f and f' respectively.
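Uniform convergence of the values alone does not give convergence in the C^1 metric. The sketch below illustrates this with the curves f_n(x) = sin(n^2 x)/n, under the simplifying assumptions that U = R and the chart is the identity. The function c1_distance, the grid size, and the hand-supplied derivatives are illustrative choices, and the finite grid only approximates the suprema from below.

```python
import math

def c1_distance(f, df, g, dg, samples=2001):
    """Grid approximation of the C^1 distance on I = [-1, 1].

    f and g are real-valued curves; df and dg are their derivatives,
    supplied by hand. The result is the larger of the sampled sup of
    |f - g| and the sampled sup of |df - dg|.
    """
    xs = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    sup_values = max(abs(f(x) - g(x)) for x in xs)
    sup_derivs = max(abs(df(x) - dg(x)) for x in xs)
    return max(sup_values, sup_derivs)

def zero(x):
    return 0.0

def curve(n):
    """f_n(x) = sin(n^2 x) / n together with its derivative n cos(n^2 x)."""
    return (lambda x: math.sin(n * n * x) / n,
            lambda x: n * math.cos(n * n * x))

for n in (1, 5, 25):
    fn, dfn = curve(n)
    print(n, c1_distance(fn, dfn, zero, zero))
```

The values of f_n are bounded by 1/n, so f_n converges uniformly to the zero curve, while the derivative at 0 is n, so the C^1 distance to the zero curve grows without bound.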


In the next section, we will discuss the quotient topology that the C^1 metric
induces on TU, and show that with this topology, the one to one correspondence
θ : TU → θ(U) × R^m of Section 6 is a homeomorphism.

Before we end this section, we want to show that the C^1 metric has reasonable
properties. The lemma above tells only what happens if convergence in the C^1
metric takes place. It says nothing about how often it happens. It may be rare
for a sequence of functions with limit f to have the corresponding sequence of
derivatives converge to f'. In fact, it is not rare. If U is complete, then C^1(I, U)
is complete. For simplicity, we will show this in the case that U is R^m.

Much of the argument is familiar. If f_n is a Cauchy sequence in C^1(I, R^m), then
for each x in I, f_n(x) is Cauchy and f'_n(x) is Cauchy. Since R^m is complete,
there is a limit for each f_n(x) which we can call f(x) and there is a limit for each
f'_n(x) which we can call g(x). It would be a little premature to call g the derivative
of f. Since the definition of C^1 demands continuous derivative, the f_n and the f'_n
are all continuous. A uniform limit of continuous functions is continuous, so f and
g are continuous. Since the convergence f'_n → g is uniform, there is a tail of the
sequence that is within ε of g. So every member in this tail satisfies


        (g(x) - ε) < f'_n(x) < (g(x) + ε)


for each x in I. If K is the maximum of |g| on I, then on this tail |f'_n(x)| < K + ε
for all x in I. Thus the tail satisfies the hypotheses of the dominated convergence
theorem for integrals. (Our functions are integrable since they are continuous.) We
get

        ∫_0^x g = lim ∫_0^x f'_n = lim (f_n(x) - f_n(0)) = f(x) - f(0)

for all x in I which demonstrates that f' = g. This finishes the argument.

[There is another argument that shows that f' = g based on the Mean Value
Theorem and direct computation of the derivative. We give it here for those
uncomfortable with the use of the dominated convergence theorem. It is nice in that
it can be applied when the definition of the C^1 metric is generalized to functions
from R^m to R^n instead of just functions defined on I.

Given ε > 0, we wish to find a δ > 0 so that ||h|| < δ implies


        ||f(x + h) - f(x) - g(x)h|| ≤ ε||h||.

Now

        ||f(x + h) - f(x) - g(x)h|| ≤ ||f(x + h) - f_n(x + h)||
                                      + ||f_n(x + h) - f_n(x) - f'_n(x)h||
                                      + ||f_n(x) - f(x)||
                                      + ||f'_n(x)h - g(x)h||.

The fourth term on the right is the difference of two linear functions to R^m
evaluated at the same point. (Actually in our setting it is the difference of two function
values multiplied by the same displacement.) Thus for a fixed value of h, we can
make the first, third and fourth terms on the right as small as we like, say less than
η/3, by using the uniform convergence of f_n to f and f'_n to g by keeping n large
enough. Thus if the second term is shown to be less than ε||h||, then we will have


        ||f(x + h) - f(x) - g(x)h|| ≤ ε||h|| + η


which can be made to hold for any η by choosing n large enough. Thus we will have
shown


        ||f(x + h) - f(x) - g(x)h|| ≤ ε||h||.


We now concentrate on how to show

(18)    ||f_n(x + h) - f_n(x) - f'_n(x)h|| ≤ ε||h||.


Note that (18) can be made true for each n by restricting h differently for each
n. However, we need to show that once h has been chosen sufficiently small, (18)
is true for all sufficiently large n.

We note that as a function of h, the expression f_n(x + h) - f_n(x) - f'_n(x)h is
equal to 0 when h = 0. Thus we are asking how much f_n(x + h) - f_n(x) - f'_n(x)h
varies from its value at h = 0 for a given value of h. This is where we apply the
Mean Value Theorem.


Let

        φ(t) = f_n(x + th) - f_n(x) - f'_n(x)(ht).

We have

(19)    ||f_n(x + h) - f_n(x) - f'_n(x)h|| = ||φ(1) - φ(0)||.


We can estimate this by using the Mean Value Theorem.

We will have to take some derivatives. We are already mixing them up pretty
well (f'(x) versus Df_x), so we will stick to the “prime” notation and regard the
expression f'_n(x)(ht) as the constant f'_n(x) (it does not depend on t) multiplied
by ht. Now we have


        φ'(t) = f'_n(x + th)(h) - f'_n(x)(h) = (f'_n(x + th) - f'_n(x))(h)

by the chain rule. Now


        ||f'_n(x + th) - f'_n(x)|| ≤ ||f'_n(x + th) - g(x + th)||
                                    + ||g(x + th) - g(x)||
                                    + ||g(x) - f'_n(x)||.

The first and third terms can be kept less than ε/3 by uniform convergence and
keeping n sufficiently large. The middle term is where we get our δ. We choose δ
to keep the middle term less than ε/3 whenever ||h|| < δ which can be done by
the continuity of g and the fact that t is restricted to lie in [0,1]. Now we have
||φ'(t)|| ≤ ε||h|| for t in [0,1]. By the Mean Value Theorem, the right side of (19)
is no more than ε||h||(1 - 0) and we have shown that (18) holds.]


14. The tangent space over a coordinate patch. 


We continue the discussion of the previous section. We have a C^r m-manifold M
with a coordinate chart (U, θ). We have the one to one correspondence θ : TU →
θ(U) × R^m as defined in Section 6. We have that TU is a quotient of C^1(I, U)
and we have the C^1 metric ρ on C^1(I, U). This gives the quotient topology on
TU. We wish to show that θ is a homeomorphism under this topology.

First we show that θ is continuous. Let ε > 0 be real. We want a δ > 0 so
that if f, g ∈ C^1(I, U) have ρ(f, g) < δ, then d(θ[f], θ[g]) < ε. Here we need to
decide on the metric on θ(U) × R^m. We decide on the metric d((a, b), (c, d)) =
max{d_1(a, c), d_1(b, d)} where d_1 is the metric on θ(U) ⊂ R^m and on R^m. We
make this choice because it makes the next argument a triviality.

Now ρ(f, g) < δ implies that (θ∘f)(0) and (θ∘g)(0) differ by less than δ and
(θ∘f)'(0) and (θ∘g)'(0) differ by less than δ. So d(θ[f], θ[g]) < δ. We now let
δ = ε and are done.

Now we show that $\theta$ is open. Suppose that $S \subset TU$ is open. We want to show
that $\theta(S)$ is open in $\theta(U) \times \mathbf{R}^m$. Let $[f] \in S$. We want a $\delta > 0$ so that if $(x, y)$
is within $\delta$ of $\theta[f]$, then there is a $[g]$ in $S$ so that $\theta[g] = (x,y)$. Since $S$ is open
in $TU$, it is the image of an open set in $C^1(I,U)$. Thus there is an $\epsilon$ so that if
$\rho(f,h) < \epsilon$, then $[h]$ is in $S$. We argue that letting $\delta < \epsilon/2$ will work.

Let $(x,y)$ be within $\delta$ of $\theta[f]$. The notation is easier with displacements, so let
$u = x - (\theta \circ f)(0)$ and let $v = y - (\theta \circ f)'(0)$. Consider


$$g_1(t) = u + (\theta \circ f)(t) + tv$$


defined on $I$. We ignore for a minute that the range of $g_1$ might not be in $\theta(U)$.
We have $g_1(0) = u + (\theta \circ f)(0) = x$ and $g_1'(0) = (\theta \circ f)'(0) + v = y$. So if the range
of $g_1$ is in $\theta(U)$ we are done by letting $g = \theta^{-1} \circ g_1$ so that $\theta[g] = (x, y)$. It is
easy to show that $\rho(f,g) < \epsilon$ so that $[g]$ is in $S$. We now modify $g_1$ to get a $g_2$
with similar properties but whose range is in $\theta(U)$.

We first take $\delta$ smaller if necessary so that the $\delta$ ball $B$ around $(\theta \circ f)(0)$ lies
in $\theta(U)$. There is a straight line homotopy from $(\theta \circ f)$ to $g_1$ defined by


$$F(t,s) = su + (\theta \circ f)(t) + stv$$


where $s \in [0,1]$. The homotopy goes into $\mathbf{R}^m$ but not necessarily into $\theta(U)$. Now
$F(0,0) = (\theta \circ f)(0)$ which is the center of the ball $B$. Also $F(0,1) = g_1(0) = x$
which is within $\delta$ of $(\theta \circ f)(0)$ and so is also in $B$. Since the homotopy is the
straight line homotopy, the straight line $\{F(0,s) \mid s \in [0,1]\}$ is also in $B$. By the
continuity of $F$ and the compactness of $[0,1]$, there is an $\eta$ so that $F(t,s)$ lies in
$B$ for $s \in [0,1]$ and $t \in [-\eta, \eta]$. Let $\sigma: I \to [0,1]$ be a bump function which is 1
on $[-\eta/2, \eta/2]$ and 0 off $[-\eta, \eta]$. Now let


$$g_2(t) = \sigma(t)u + (\theta \circ f)(t) + \sigma(t)tv.$$


On $[-\eta/2, \eta/2]$ we have $g_2 = g_1$. This guarantees that $g_2(0) = g_1(0)$ and $g_2'(0) =
g_1'(0)$ so that $g = \theta^{-1} \circ g_2$ also has $\theta[g] = (x, y)$. It is again easy to show that
$\rho(f,g) < \epsilon$ so that $[g]$ is in $S$. Off $[-\eta, \eta]$ we have $g_2 = (\theta \circ f)$. This guarantees
that the image of $g_2$ off $[-\eta, \eta]$ lies in $\theta(U)$. On $[-\eta, \eta]$ we have that the image
of $g_2$ lies in the image of $F$ on $[-\eta, \eta] \times [0,1]$ which lies in $B$. This completes the
argument.
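
The modification by a bump function can be tried out numerically. The following is a minimal sketch in local coordinates with m = 1, not part of the argument itself. The curve `c` (standing in for the chart composed with f), the displacements `u` and `v`, the value of `eta`, and the particular smooth bump built from exp(-1/t) are all hypothetical choices made only for illustration.

```python
import math

def smooth_step(s):
    """C-infinity step: 0 for s <= 0, 1 for s >= 1."""
    def h(r):
        return math.exp(-1.0 / r) if r > 0 else 0.0
    return h(s) / (h(s) + h(1.0 - s))

def bump(t, eta):
    """Bump function: 1 on [-eta/2, eta/2], 0 off [-eta, eta]."""
    return smooth_step((eta - abs(t)) / (eta / 2.0))

# Hypothetical data (m = 1): c plays the role of the curve in local coordinates.
def c(t):
    return 0.3 * math.sin(t)

u, v = 0.05, -0.04      # displacements u = x - c(0), v = y - c'(0)
eta = 0.25              # assumed small enough for the homotopy to stay in B

def g1(t):
    return u + c(t) + t * v

def g2(t):
    s = bump(t, eta)
    return s * u + c(t) + s * t * v

x = c(0.0) + u          # target base point
y = 0.3 + v             # target velocity, since c'(0) = 0.3

step = 1e-5             # central difference step, well inside [-eta/2, eta/2]
g2_velocity = (g2(step) - g2(-step)) / (2 * step)

print(abs(g2(0.0) - x), abs(g2_velocity - y))   # both should be tiny
print(g2(0.5) == c(0.5))                        # g2 agrees with c off [-eta, eta]
```

The point of the design is that the bump function multiplies only the correction terms, so the curve is changed near 0 and left alone everywhere else.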


15. Approximations. 


None of the statements in this section will be proven. 

Just as one can define the $C^1$ metric, one can define the $C^r$ metric for any $r > 1$
and also a $C^\infty$ metric. These are for functions with range in some Euclidean space.
For maps to an arbitrary manifold, it is harder to make well defined measurements,
so one defines $C^r$ topologies and $C^\infty$ topologies instead of metrics. Once a topology
is established, then questions about open, closed, compact and dense sets can be
discussed. A statement that a set of functions is an open set in a topology says that
if a function has the defining property of the set, then all nearby functions have the
property. A statement that a set of functions is dense says that any function can
be approximated by a function in the set.

There is more than one $C^r$ topology to choose from. There is the “weak” topology
and the “strong” topology and there are perhaps others. The weak and strong
coincide for a compact domain. We do not provide definitions. The results below
leave out which of the $C^r$ topologies are being used on the function spaces.

Many of the approximation results are proven locally first and then extended to 
global results using bump functions or partitions of unity. As an exercise, one can 
show that $C^\infty$ functions are dense in the continuous functions using the uniform
metric by approximating a continuous function by constant functions on small sets 
and then using partitions of unity to smooth things out. 
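
A one-dimensional version of this exercise can be sketched as follows. The sketch assumes the domain is the interval [0, 1], samples the function at the points i/N, and glues the constant values together with a partition of unity made from the standard bump exp(-1/(1 - s^2)). The names `raw_bump` and `smooth_approx` and the test function are hypothetical, chosen only for illustration.

```python
import math

def raw_bump(s):
    """Smooth bump supported on |s| < 1."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def smooth_approx(f, N):
    """Return a smooth function on [0, 1] built from N + 1 samples of f.

    The weights phi_i(x) = b_i(x) / sum_j b_j(x) form a partition of unity
    subordinate to the intervals (x_i - 1/N, x_i + 1/N), where x_i = i / N.
    """
    centers = [i / N for i in range(N + 1)]
    values = [f(c) for c in centers]

    def g(x):
        weights = [raw_bump((x - c) * N) for c in centers]
        total = sum(weights)   # positive on [0, 1]: some center is within 1/(2N)
        return sum(w * val for w, val in zip(weights, values)) / total

    return g

f = lambda x: abs(x - 0.5)            # continuous but not differentiable at 0.5
N = 50
g = smooth_approx(f, N)
grid = [k / 1000 for k in range(1001)]
max_err = max(abs(f(x) - g(x)) for x in grid)
print(max_err < 1.0 / N)
```

Since g(x) is a weighted average of values of f at sample points within 1/N of x, the uniform error is controlled by how much f varies over distance 1/N. That is the reason the approximation improves as N grows.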

Consider the next two results. 


LEMMA 15.1. Let $M$ be a $C^r$ $m$-manifold, $2 \le r \le \infty$. Then, in the space of
$C^r$ functions from $M$ to $\mathbf{R}^n$ with the $C^r$ topology, the embeddings are dense if
$n > 2m$ and the immersions are dense if $n \ge 2m$.


THEOREM 15.2. Let $M$ and $N$ be $C^r$ manifolds of dimension $m$ and $n$ respectively
with $2 \le r \le \infty$. If $n \ge 2m$, then the immersions of $M$ into $N$ are dense
in the $C^r$ maps from $M$ to $N$ with the $C^r$ topology.


The proof of the second result will use the first to get approximations on charts.
Then bump functions will be used to piece together an apparently incompatible
collection of pieces of approximations.

An openness result is: 


LEMMA 15.3. In the space of $C^r$ maps with the $C^r$ topology, $r \ge 1$, between
manifolds, the immersions, the submersions and the embeddings each form an open
set.


A main approximation theorem is: 


THEOREM 15.4. Let $M$ and $N$ be $C^s$ manifolds, $1 \le s \le \infty$. Then the $C^s$
functions from $M$ to $N$ are dense in the $C^r$ topology on the $C^r$ functions from
$M$ to $N$ for $0 \le r < s$.


Approximations are also used to increase the differentiability of a differentiable 
structure on a manifold. A typical result in this direction is quoted above as 
Theorem 8.1. 


16. Sard’s theorem. 


Regular values of $C^r$ maps are nicer than critical values. Recall Lemma 11.3
which says that the inverse image of a regular value is a submanifold. It turns out 
that regular values are dense in the range. The idea behind this is that critical 
points are places where the map is squashing the domain more than required to 
fit into the range. The image of such squashing cannot occupy much of the range. 
This is the content of Sard’s theorem. It turns out to have many applications. It 
also turns out to be rather delicate to prove. We will prove a very special case to 
illustrate some of the ideas. We will mention an application of the full theorem in 
the next section. 

The fact that it is delicate to prove is supported by the fact that it is false without
the proper restrictions. There is a $C^1$ map from $\mathbf{R}^2$ to $\mathbf{R}$ whose set of critical
values includes an interval. Thus the regular values cannot be dense in the range.
In fact the map is quite strange. A critical point in a map from $\mathbf{R}^2$ to $\mathbf{R}^1$ can
only be one at which the derivative is the zero linear map. That means that the
tangent plane to the graph is horizontal. The map has the property that there is
an arc of critical points in $\mathbf{R}^2$ whose image in $\mathbf{R}^1$ is an interval. Thus there is a
path in the graph which rises in spite of the fact that there is a horizontal tangent
to the graph at every point along the path.

To properly state Sard’s theorem, we need some definitions. A cube of side $a$ in
$\mathbf{R}^n$ is a translate of $[0,a]^n = \{(x_1, \ldots, x_n) \mid 0 \le x_i \le a\}$. The volume of a cube of
side $a$ in $\mathbf{R}^n$ is defined to be $a^n$. We denote the volume of the cube $C$ by $\mu(C)$.
One can similarly define the volume of a rectangular solid. A set $A$ in $\mathbf{R}^n$ is said
to have measure 0 if, for every $\epsilon > 0$, it can be covered by a countable collection
of cubes whose volumes sum to less than $\epsilon$. Countable unions of sets of measure
0 have measure 0. Thus checking that a set has measure 0 can be done on small
open sets. It is provable that a nonempty open set cannot have measure 0. Thus a set of
measure 0 can contain no nonempty open set and thus has dense complement. It turns out
that the regular values are more than just dense. A set is called residual if it is
the intersection of a countable collection of dense open sets. The Baire category
theorem (which applies to $\mathbf{R}^n$ since it is a complete metric space) says that a
residual set is dense. However, there are dense sets (e.g., the rationals in $\mathbf{R}$) that
are not residual.
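
The definition of measure 0 can be illustrated on a countable set. The sketch below covers the first few rationals in [0, 1] by intervals (cubes in dimension one), giving the k-th point an interval of length eps / 2^(k+1) so that the lengths sum to less than eps. The enumeration order and the helper names are hypothetical, and exact rational arithmetic is used so that the sum is not blurred by rounding.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], ordered by denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Cover the k-th point by an interval of length eps / 2**(k + 1)."""
    intervals = []
    for k, x in enumerate(points):
        half = eps / 2 ** (k + 2)
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(200)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)
print(total < eps)     # the lengths sum to eps * (1 - 2**-200)
```

The same geometric series works for any countable set, which is why a dense set such as the rationals can still have measure 0.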

We have only defined sets of measure 0 in $\mathbf{R}^n$. We define a set to have measure
0 in a manifold $M$ if the intersection of the set with the domain of each coordinate
map has its image under the coordinate map a set of measure 0. That this definition
makes some sense is supported by the next lemma.


LEMMA 16.1. Let $U$ be an open set in $\mathbf{R}^n$ and let $f : U \to \mathbf{R}^n$ be a $C^1$ map. If
$X \subset U$ has measure 0, then so does $f(X)$.


PROOF: Because $f$ is $C^1$, $\|Df_x\|$ is bounded on compact sets. Thus on a ball $B$,
we have a bound $K$ for $\|Df_x\|$ and


$$\|f(x) - f(y)\| \le K\|x - y\|$$


for any $x$ and $y$ in $B$. In a cube $C$ of side $a$, the distances are bounded by $a\sqrt{n}$.
Thus the distances in $f(C)$ are bounded by $aK\sqrt{n}$. Let $L = K\sqrt{n}$. We have that
$f(C)$ is contained in a cube of side no more than $aL$ with volume no more than
$a^n L^n = L^n \mu(C)$.

Since $X$ can be covered by countably many balls and countable unions of sets of
measure 0 have measure 0, we need only prove the lemma for $X \cap B$. Now given
$\epsilon > 0$, we can cover $X \cap B$ by cubes whose volumes add up to less than $\epsilon$. Thus
$f(X \cap B)$ can be covered by cubes whose volumes add up to less than $L^n \epsilon$. But
$L^n$ is fixed for this $B$ and we can make the image sum as small as we like. This
completes the proof.


The full statement of Sard’s theorem is:


THEOREM 16.2 (SARD’S THEOREM). Let $M$ and $N$ be manifolds of dimensions
$m$ and $n$ respectively and let $f : M \to N$ be a $C^r$ map. If


$$r > \max\{0, m - n\},$$


then the critical values have measure 0 in $N$ and the regular values are residual in
$N$.


Note that the example claimed above has $m = 2$, $n = 1$ and $r = 1$ which just
misses the hypotheses of the theorem. There is no such example of a $C^2$ map from
$\mathbf{R}^2$ to $\mathbf{R}$. The case where $r = \infty$ is easier than the full theorem and the proof
in this case is found in many textbooks. It is also sufficient for most applications
because approximation theorems (see Section 15) usually allow the assumption that
all maps are $C^\infty$. We will prove even less than the full $C^\infty$ case. We will prove:


THEOREM 16.3 (VERY BABY SARD’S THEOREM). Let $f : M \to N$ be a $C^1$ map
between $m$-manifolds. Then the set of critical values has measure 0 in $N$.


PROOF: A countable union of sets of measure 0 has measure 0 and both domain
and range can be covered by countable collections of coordinate charts. Thus we
assume that we are looking at a piece from a coordinate chart to a coordinate chart.
From the lemma and the definition, we can assume that we are looking at the map
expressed in local coordinates. Thus we will assume that $f$ is a $C^1$ map from an
open set $U$ of $\mathbf{R}^m$ into $\mathbf{R}^m$.

Let $C$ be a cube of side $a$ in $U$. Again by countable unions, it suffices to consider
only the image of the critical points that lie in $C$.

We can divide $C$ up into $n^m$ cubes of side $a/n$. The idea of the proof is this.
With $a/n$ very small, a constant plus $Df$ will be a very good approximation of $f$.
But at a critical point, the image of $Df$ will be a linear subspace of dimension no
more than $m - 1$. Thus a small cube of side $a/n$ will have extent in the direction of
this linear subspace that will be approximated by $a/n$ and extent in the direction
perpendicular to the subspace that will be approximated by $\epsilon a/n$ for very small $\epsilon$.
This will give that the image of the cube has a very small volume.

Let $S$ be one of the small cubes of side $a/n$. We have $\|y - x\| \le \sqrt{m}(a/n)$ for
$x, y$ in $S$. For $n$ large enough, we can get


$$\|f(y) - f(x) - Df_x(y - x)\| \le \epsilon\|y - x\| \le \epsilon\sqrt{m}(a/n).$$


If $S$ contains a critical point we can choose $x$ to be a critical point. This makes
the set of points $\{Df_x(y - x) \mid y \in S\}$ lie in a linear subspace $V$ of dimension no
more than $m - 1$ in $\mathbf{R}^m$. Thus the set $\{f(y) - f(x) \mid y \in S\}$ lies within $\epsilon\sqrt{m}(a/n)$
of $V$ so that $\{f(y) \mid y \in S\}$ lies within $\epsilon\sqrt{m}(a/n)$ of the translate $W = f(x) + V$.
Now $\|Df\|$ is bounded by some $K$ on the cube $C$. Thus


$$\|f(y) - f(x)\| \le K\|y - x\| \le K\sqrt{m}(a/n)$$


and we have that $f(y)$ lies within $K\sqrt{m}(a/n)$ of $f(x)$ and within $\epsilon\sqrt{m}(a/n)$
of $W$. Thus $f(S)$ lies in a rectangular solid where $m - 1$ of its dimensions
are $2K\sqrt{m}(a/n)$ and one of its dimensions is $2\epsilon\sqrt{m}(a/n)$. The volume of $S$ is
$\mu(S) = (a/n)^m$ and the volume of $f(S)$ is no more than $\epsilon K^{m-1}(2\sqrt{m})^m (a/n)^m$ or
$\epsilon K'\mu(S)$. Here $K' = K^{m-1}(2\sqrt{m})^m$ depends on $C$ and not on $S$. The sum of all $\mu(S)$ for the $n^m$
small cubes in $C$ is $\mu(C)$. The sum of the volumes of the $f(S)$ for those $S$ that
contain a critical point is thus no more than $\epsilon K'\mu(C)$. We can make $\epsilon$ as small as
we like by increasing $n$. Thus the image of the critical points in $C$ has measure 0.
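
The counting in this proof can be watched on a concrete map. The sketch below assumes the hypothetical example f(x, y) = (x, y^2) on the cube [-1, 1]^2, whose critical points are exactly those with y = 0. It subdivides the cube into n^2 squares and adds up the areas of the boxes containing the images of the squares that meet the critical set. The boxes here are the exact image rectangles rather than the slightly larger solids used in the proof, and exact fractions are used so that the boundary case y = 0 is detected reliably.

```python
from fractions import Fraction

def critical_image_area(n):
    """Total area of boxes covering f(critical points) for f(x, y) = (x, y**2).

    The cube C = [-1, 1]^2 is cut into n * n squares of side 2/n.  Df has
    determinant 2y, so the critical points are exactly the points with y = 0.
    """
    side = Fraction(2, n)
    total = Fraction(0)
    for i in range(n):                 # column index; every column contributes equally
        for j in range(n):             # row index
            y0 = -1 + j * side
            y1 = y0 + side
            if y0 <= 0 <= y1:          # the square S meets the critical set
                # f(S) is a rectangle of width `side` and height max(y0^2, y1^2)
                height = max(y0 * y0, y1 * y1)
                total += side * height
    return total

for n in (10, 20, 40):
    print(n, critical_image_area(n))
```

For even n the total works out to 16/n^2, so it shrinks to 0 as n grows, in line with the bound $\epsilon K'\mu(C)$ in the proof.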


17. Transversality. 


None of the statements in this section will be proven.

Let $f : M \to N$ be a $C^1$ map and let $A \subset N$ be a submanifold. We say that $f$
is transverse to $A$ if for every $x$ with $y = f(x) \in A$, the tangent space $N_y$ of $N$
at $y$ is spanned by $A_y$ and $Df_x(M_x)$. In other words, $N_y = A_y + Df_x(M_x)$. This
is written $f \pitchfork A$. We define the codimension of $A$ in $N$ to be the dimension of $N$
minus the dimension of $A$.

Transversality generalizes the notion of submersion. In a submersion at a point, 
the tangent space in the domain must map to cover the tangent space in the range. 
In a transverse map, the tangent space from the domain may not cover that in the 
range, but it does so with the help of the submanifold that it is transverse to. Note 
that transversality cannot take place if the dimensions of domain and submanifold 
are too small to add up to the dimension of the range. If they are big enough to 
add up, then transversality fails if the image is too “tangent” to the submanifold. 
Transversality says that this degree of tangency does not take place. The map 
$x \mapsto x^2$, viewed through its graph $x \mapsto (x, x^2)$ in the plane, is not transverse to the $x$-axis but it is transverse to the $y$-axis.
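
The spanning condition in the definition can be checked by hand in this example, or with a few lines of code. The sketch below assumes the example is read as the curve x -> (x, x^2) in the plane, which is an interpretation and not something stated above. In the plane, two vectors span the tangent space exactly when their determinant is nonzero. The helper names are hypothetical.

```python
def det2(a, b):
    """Determinant of the 2x2 matrix with columns a and b."""
    return a[0] * b[1] - a[1] * b[0]

def curve(x):
    return (x, x * x)        # the map x -> (x, x^2), an assumed reading of the example

def curve_velocity(x):
    return (1.0, 2.0 * x)    # Df_x applied to the unit tangent vector of the line

def is_transverse_at(x, tangent_of_A):
    """True when Df_x(M_x) and A_y together span the plane (y = curve(x) assumed in A)."""
    return det2(curve_velocity(x), tangent_of_A) != 0.0

# The curve meets both axes only at x = 0.
print(is_transverse_at(0.0, (1.0, 0.0)))   # x-axis: False
print(is_transverse_at(0.0, (0.0, 1.0)))   # y-axis: True
```

At x = 0 the velocity of the curve is horizontal, so it adds nothing to the tangent line of the x-axis but does complete the tangent line of the y-axis.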

That transversality is a nice condition is seen by the following. 


THEOREM 17.1. Let $f : M \to N$ be a $C^r$ map, $r \ge 1$, and $A \subset N$ a $C^r$
submanifold. If $f$ is transverse to $A$, then $f^{-1}(A)$ is a $C^r$ submanifold of $M$ and
the codimension of $f^{-1}(A)$ in $M$ is that of $A$ in $N$.


This is not hard to show by reducing the theorem locally to a question about 
regular values. 

Niceness is nice and availability is better. The following is a version of the main 
result about transversality. As in previous sections we are not careful about exactly 
which C” topology is being used on the space of functions. 


THEOREM 17.2. Let $M$ and $N$ be $C^r$ manifolds and $A$ a $C^r$ submanifold of
$N$, $r \ge 1$. Let $C^r(M,N)$ be the space of $C^r$ maps from $M$ to $N$ with the $C^r$
topology.

(1) The maps that are transverse to $A$ are residual in $C^r(M,N)$.


(2) If $M$ is compact and $A$ is a closed subset of $N$, then the maps that are
transverse to $A$ are also open in $C^r(M,N)$.


The theorem is proven with the help of Sard’s theorem and various of the tech- 
niques discussed in the other sections. 


18. Manifolds with boundary. 


This section is even sketchier. We prove nothing and define nothing. 

The manifolds that we have considered have been modeled on Euclidean spaces. 
The manifolds have had no boundary since each point has to have a neighborhood 
homeomorphic to an open subset of some $\mathbf{R}^m$. To achieve boundary we have to
allow homeomorphisms to open subsets of $\mathbf{R}^m_+$, the upper half space


$$\{(x_1, \ldots, x_m) \mid x_m \ge 0\}.$$


Various notions have to be redefined to take the new structures into account.
Submanifolds with boundary of a given manifold will intersect (if their boundaries are
transverse) in subspaces that are not even modeled on $\mathbf{R}^m_+$. They will have corners.
A technique for rounding corners can be developed so as to avoid building up even
more variety into the structures.